The study here reported is a modified experiment, in which a group of expert clinicians attempts to identify subjects by TAT analyses alone. Blind interpretations of TAT records from 36 adolescents are systematically compared with (similarly blind) interpretations of selected criterion data.


These criterion data include objective test scores, summaries of observations and interviews, and Rorschach tests. The interpretations of all of these criterion instruments, as well as the TAT interpretations, were done by clinicians who had not administered the tests, and who had no background information about their subjects other than age and sex. In order to make subsequent comparisons easier, all instruments were interpreted or summarized within the same conceptual framework, previously developed by the entire team of researchers participating in the study.


The experiment follows the general outlines proposed by Cronbach (1948) for projective validation. The method permits comparison of the total projective analysis with other complex data and, at the same time, minimizes judge bias.


TAT analysis is treated as a “prediction,” and matched against statements pertaining to a particular “area” of personality function, which are randomly selected from the criterion data of three of the adolescent subjects. Modifications in the Cronbach method (in particular, the use of an analysis of variance design) permit isolation of the influences of many components of the judging process. Statements can be derived about the differential effects of the various behavior “areas,” the effects of particular subjects, order and sequence of presentation to the judges, criterion instruments, etc., as well as many of the interaction effects of these factors.

Matching accuracy was computed in two ways, to represent the two tasks required of the judges: that of identifying true items about the subject whose TAT prediction was being compared and that of discriminating the items taken from some other subject’s criterion data. These scores are separately analyzed as “I” (identification) and “D” (discrimination) scores respectively.


METHODOLOGY OF THE STUDY


The study is essentially a research on individuals, having the particularity and the time span characteristic of a case report. It attempts also the comprehensiveness of depth sampling which is found primarily in intensive examination of a single case; that is, it assesses the full range of personality levels from overt behavior through the directly underlying motivational structure of current needs and press to the basic structural underpinnings of the personality and the dynamic actions and interactions of basic drives and defensive patterns. However, we have attempted also, through itemization of criterion data, to make use of the greater precision inherent in the trait method of validation, while preserving a comprehensive approach to the total personality.


In terms of the particular subgoals served, we have represented most explicitly the “predictive” and the “concurrent” types of validity (American Psychological Association, 1954). The predictive aspect derives from the fact that projective predictions are made from responses to a single TAT administration, analyzed blindly, and compared with criterion data covering several years’ duration. The designation “concurrent” refers to the data from other specific tests against which the TAT is matched. Some indication of the “content validity” of the test is also given through the item listing of personality variables matched in the research. Implicit in the process of TAT interpretation, there is also some allusion to the validation of constructs: the constructs outlined by Henry (1956) as those underlying TAT responses.

The Sample 

The 36 adolescents were selected from a larger group of 108 cases previously studied in a typical midwestern community.*

* These cases were studied as part of a study of adolescent character and personality undertaken by the Committee on Human Development of the University of Chicago and supported by the Lilly Endowment.


JANE FARLEY 


AND 


All the 108 subjects (Ss) were studied over a period of years, but the records vary in comprehensiveness. These 36 were selected as those having the most complete case records and all of the desired criterion data.


It is notable and germane to the study that this population represents an unusual degree of homogeneity, comprising normal adolescents from the same grade level in a single school, all of whom were born within the same year. They lived in a single small community, were participants in overlapping social groups, were subject to similar local events, and occupied generally the same socioeconomic level.


Logic of Choice of Criterion Data

There exists a generally accepted progression from overtness to covertness in the avowed purposes of various personality study techniques. The “objective” paper-and-pencil tests describe more specific aspects of behavior; the observational and interview methods, with their increased dependence upon psychological inference, are often thought to tap some of the motivational patterns; and the projective techniques are thought by some to reflect more basic structural and dynamic layers of personality. The TAT itself does not fit into this clinical division of labor in any clear-cut fashion. It is sometimes seen as unable to predict overt personality or life history material and it is sometimes thought particularly useful in this regard. The Rorschach has been said to deal with “deeper” layers of personality than the TAT. Sometimes it has been reported that the TAT reflects “content” aspects of personality and the Rorschach “structural” elements. We can hardly hope to answer conclusively the question of what the TAT does and does not measure, but generally rich and varied data should shed some light on these issues.

[...] several features infrequently found in validity studies: each investigator [...] addressing his analyses to particular areas of function, and yet each did [...] independently of every other participant. The [...] closely comparable

VALIDITY OF THEMATIC APPERCEPTION TEST


statements about the same set of persons. We should thus be able, by a procedure of comparing these statements, to see how the TAT interpretations agree with the more overt instruments, how they agree with the subjective reports from observation and interview, and how they agree with the estimates of the other projective instrument.
With that in mind, we have selected as our criterion data representatives of these three general levels of the overt-covert continuum. At the overt level, the Sociogram and Guess Who test (Havighurst & Taba, 1949), a Family Relations Questionnaire (Brown, Morrison, & Couch, 1947), the Stanford-Binet, Standard Achievement tests, and the Emotional Response test (Havighurst & Neugarten, 1955). At the subjective level, observation and interviews with the S, teachers, and parents. At the projective level, the Rorschach.



To compare these various levels of personality expression with the TAT predictions in each personality area, we have used a design based on a research done by Cronbach in validating the Rorschach (Cronbach, 1948). Not only the general model but also many of the modifications of that design appearing in this study were suggested or implied in Cronbach’s paper.



Cronbach was seeking a validational method which permitted a comparison of the unbroken whole of the projective prediction with similarly complex interrelated data and which did not permit judge bias. He accepted the “blind matching” method as generally satisfying the requirements but also deplored some of its limitations. The first of these limitations is judgment by elimination (if four sets of statements must be paired, the matching of the fourth pair has little to do with the actual congruity of the statements). Secondly, as Cronbach (1948, p. 367) points out, “If an analysis is partly right and partly wrong, it may be matched with the criterion, and the prediction will seem to be valid; the method does not show the degree of correctness of each prediction, nor the aspects of the prediction that are invalid.” Cronbach’s solution was to use the essential method of blind matching, but to force judges to make independent pairings (thereby eliminating judgment by elimination) and to present for matching discrete parts of the prediction, so that only those statements which were correct would be correctly paired with the criterion.

Since, as stated, this design is based on the original Cronbach study, and is more intricate, it is perhaps most easily described by reference to the methods of the earlier research. Cronbach’s method, very briefly, is:

1. Clinical predictions (analyses) [...]
[...] a criterion was created [...]
[...] the Ss were divided into groups of three (triads). The [...] for each triad included the [...] projective predictions [...] criterion paragraphs [...]
[...] prediction statements to [...] measure [...] correlation (the phi coefficient [...]) to assess the [...]

In terms of actual scores, Cronbach’s results were somewhat disappointing, but these low scores seemed to bear out his contention that many of the previously reported high validity measures, based on blind matching, were misleadingly high. Consequently, he felt that the method, which he admitted was “laborious,” had promise in presenting a measure of congruity which was not falsely inflated by spurious partial matching. He was, however, dissatisfied with many of the aspects of his method, and pointed out that the reported research was a trial run of a new technique, leaving much to be desired as a finished design. Part of the burden of the seemingly poor conformity between prediction and criterion lay with the shortcomings of the criterion used. The descriptive paragraphs were unsystematic, and thus varied greatly in comprehensiveness and depth. In addition, they were based on one individual’s viewpoint only, a fact which could introduce the personal bias factor that Cronbach’s design had so carefully factored out. Itemization of the projective prediction into separate statements was also, according to Cronbach (1948, p. 368), “a definite limitation of the proposed validation method, since an assessment portrait is more than the sum of its statements.”

Finally, in searching out the weaknesses in the design, as applied, he avowed himself caught between the need for large numbers of judges (necessary for replication of each statement-criterion matching) and the need for “hand picking” expert judges. His solution was to forego the hand picking, in order to secure the required number of judges.

The design of the current study is an extended and modified version of the study just described. For each of the (12) triads of Ss, there are not one, but three criteria against which the TAT prediction is to be matched. The number of statement sheets per triad then becomes nine instead of three. This greatly complicates the problems of constructing composite sheets and of arranging presentation orders of the data to the judges. However, the very complexity and number of the judge decisions required is intended to resolve, for this study, some of the problems which beset Cronbach in his relatively simpler design.


DETAILS OF PROCEDURE IN THE PRESENT STUDY

1. Our first major change in the research design described above was the decision to “fragment” the criteria rather than the TAT prediction. Cronbach’s own dissatisfaction with an itemized projective report, plus the fact that our more structured criterion material permitted a fairly natural statement breakdown, were the deciding factors in this modification.

2. Systematization of the criterion data preparatory to itemizing. As noted earlier, all of the data collected in the project from which this study draws its material were addressed to a common outline of personality function. This common outline proposes two broad areas of function for the individual personality: Social Interaction (including peer relationships and family dynamics) and Characteristics of the Self (including mental functioning, emotional reactivity and adjustment, and defense mechanisms). Each of the investigators participating in the project had agreed beforehand on the use of this outline in the research. Thus, each tried to fill as completely as his instrument would allow the subheadings under these two general sectors of personality. The resulting material from all sources was almost ideally parallel in content. This fact made it possible for us to subdivide the personality outline still further, into 27 “statement items,” without sacrificing (in most cases) the coverage of all our data, prediction and criterion. The purpose of dividing criterion data into statements was to provide a statement sheet for each S such that the meanings of each number would be directly comparable on all sheets, i.e., Statement 1 would always represent “degree of participation in peer group,” even though different Ss would provide different specific data (“active participant,” “isolate,” etc.); similarly, Statement 16 would always refer to basic emotional attitude, whether the particular data about an S were “active” or “passive.”

3. Itemizing the criteria. At this point, we had collected for each S: a TAT analysis (or “prediction”) which was to be used as a total integrated picture, and also 3 sets of parallel criterion data (objective, subjective, and projective), which were to be reduced to statement form. Once we had decided the contents of the 27 statements,* our procedure for itemizing the criterion data for each of the 36 Ss was as follows:

a. Objective data (paper-and-pencil tests): These data almost itemized themselves. Although the material for different Ss varied in quality and amount, statements were very much to the point, and little choice of material was involved in the itemizing. Each particular test in this objective battery provided statement material for its own area of specialization, and there was no overlap. Peer group statements came from the sociometric analyses, the family area was covered by the Family Relations Questionnaire, standard IQ and Achievement tests filled in the Mental Functioning Area, and the Emotional Response test was our objective assessment of Emotional Adjustment.

* See Appendix A.



b. Subjective data (observations and interviews): These were the most difficult and time-consuming statement breakdowns to make. There were varying amounts of data for the Ss, covering an average period of two or three years of study. They were also much less systematically recorded than the other criterion materials, since interviews were conducted permissively, and the observations were recorded verbatim without a fixed advance outline. The initial selection of the 27 statement items from this source was done by the second author, before looking at any of the other criterion data or at the TAT summaries (in order to minimize biased sampling). Since many of the observation-interview summaries ran to 40 or 50 pages in length, any sampling to cover statement areas was bound to be highly selective. Consequently, each full set of observations and interviews, for each S, was rechecked by an independent editor* against the item samples chosen. This editor was not a psychologist, and it was not his function to comment on the relevance of particular material chosen for the statement it purported to represent. He was instructed merely to see that the sample was properly representative of the total observation and interview material on any particular statement point. If the investigator’s original sample for Item 16 (activity-passivity) had been a statement from the interviews with mother, such as, “is energetic and full of bounce,” the editor reviewing the data would not necessarily add any corroborative opinions of neighbors, of the self, or of others. If, however, the observer (or teacher) had produced the information that “subject is often sluggish and inactive,” this comment would be added to the mother’s statement, to properly represent the “balance” of the complete set of data. Thus, the editor cut out repetitious material chosen for the same item and added material which modified, elaborated, or contradicted the selected material.


c. Projective data (the Rorschach test): Each S’s Rorschach summary was also broken into 27 statements, reflecting the statement areas of our outline. Since the investigator had already worked with the observation and interview material for the Ss, this itemization was undertaken by an independent clinician† who used the outline of statement contents. By “independent” we mean that he had not seen the data of the study. Where occasional Rorschach summaries did not cover a particular statement area, the gaps were filled, when possible, by statements from the “sequence analysis” of the same Rorschach protocol.

* Mr. Robert Hendrickson, Production Manager for Today’s Health magazine.

4. Separating subjects into triads. As noted earlier, 36 Ss were used (chosen from an original population of 108, tested in the adolescent personality project). The 36 Ss selected were those whose data were “richest,” i.e., for whom all the criterion data were available. The 18 males and the 18 females with the data judged most fruitful were chosen. First, the 18 males were randomly assigned to 6 triads, and randomly assigned their positions (A, B, or C) in the triads. The same procedure was followed for the 18 females.
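The random assignment in Step 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subject labels (`M01`, `F01`, and so on), the function name `assign_triads`, and the seed are hypothetical, and a pseudorandom generator stands in for whatever randomization device the investigators actually used. Shuffling the list and cutting it into consecutive groups of three assigns both triad membership and the A, B, C positions at random.

```python
import random


def assign_triads(subject_ids, rng):
    """Randomly split subjects into triads with positions A, B, and C.

    Hypothetical helper: shuffling first, then slicing into threes,
    randomizes both the triad a subject lands in and the position held.
    """
    if len(subject_ids) % 3 != 0:
        raise ValueError("number of subjects must be divisible by 3")
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    triads = []
    for start in range(0, len(shuffled), 3):
        members = shuffled[start:start + 3]
        triads.append(dict(zip("ABC", members)))
    return triads


rng = random.Random(1959)  # arbitrary seed, chosen only for repeatability
males = [f"M{i:02d}" for i in range(1, 19)]    # hypothetical labels
females = [f"F{i:02d}" for i in range(1, 19)]  # hypothetical labels

# The sexes are handled separately, as in the text: 6 male and 6 female triads.
male_triads = assign_triads(males, rng)
female_triads = assign_triads(females, rng)
```

Keeping the two sexes in separate calls mirrors the description above, in which the same procedure was applied first to the males and then to the females.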


5. Making up the final composite test sheets. We had now, for each individual, a TAT summary, and three itemized statement sheets: one objective, one subjective, and one projective. Thus, for each triad, there were three TAT summaries and nine statement sheets. The next task was to combine the items from individual statement sheets into composite statement sheets, which contained materials relating to each of the individuals in the triad.

Starting with Triad 1, we began by combining the three “objective” statement sheets (for Individuals A, B, and C) into composite Test Sheets 1, 2, and 3. To do this, we assigned numbers from 1 to 6 to each of the six permutations of the ABC sequence (BAC, ABC, CBA, CAB, ACB, BCA). We then consulted a table of random numbers for random orders of the six digits. The three composite objective statement sheets were made up according to this formula. That is, if our first three random listings were 1, 3, and 5, this corresponded to the permutations BAC, CBA, and ACB, respectively. This meant that Item 1 would be put into the composite sheets in the order BAC, Item 2 in the order CBA, and Item 3 in the order ACB. In terms of the statements as assigned to each composite test sheet, it would mean that Test Sheet 1 would have B’s first statement, C’s second statement, and A’s third statement. Test Sheet 2 would have A’s Item 1, B’s Statement 2, and C’s Statement 3. Test Sheet 3 would have Statements 1, 2, and 3 supplied by C, A, and B respectively.

This procedure was carried out for all of the 27 statements, for each type of criterion, for each of the triads.
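The scrambling rule in Step 5 is easy to misread, so a short Python sketch of it may help. The permutation numbering (1 = BAC through 6 = BCA) and the worked case (random listings 1, 3, 5) are taken from the text; the function names are hypothetical, and drawing one permutation number independently per item from a pseudorandom generator is an assumption standing in for the printed table of random numbers.

```python
import random

# Numbering of the six permutations of ABC, in the order listed in the text.
PERMUTATIONS = {1: "BAC", 2: "ABC", 3: "CBA", 4: "CAB", 5: "ACB", 6: "BCA"}


def build_composite_sheets(permutation_numbers):
    """Return three composite sheets as lists of source positions.

    sheets[k][i] is the triad member (A, B, or C) whose statement for
    item i + 1 is placed on composite Test Sheet k + 1.
    """
    sheets = [[], [], []]
    for number in permutation_numbers:
        order = PERMUTATIONS[number]
        for sheet_index, source in enumerate(order):
            sheets[sheet_index].append(source)
    return sheets


def draw_permutation_numbers(n_items, rng):
    """Assumed stand-in for the random number table: one draw per item."""
    return [rng.randint(1, 6) for _ in range(n_items)]


# Worked example from the text: listings 1, 3, 5 give BAC, CBA, ACB.
worked = build_composite_sheets([1, 3, 5])
# worked[0] == ['B', 'C', 'A']  (Test Sheet 1)
# worked[1] == ['A', 'B', 'C']  (Test Sheet 2)
# worked[2] == ['C', 'A', 'B']  (Test Sheet 3)

# A full set of 27 items for one criterion type in one triad.
full = build_composite_sheets(draw_permutation_numbers(27, random.Random(0)))
```

Because each item uses a complete permutation, every item number appears exactly once for each triad member across the three composite sheets, which is what lets a judge rate every statement independently rather than by elimination.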

6. Assembling test materials for presentation to judges. For each triad, we now had 3 TAT summaries and 9 composite, randomly “scrambled” criterion statement sheets (Statement Sheets 1-3, objective; 4-6, projective; and 7-9, subjective).

† [...] Klinger, Research Associate, Association of American Medical Colleges, and a psychological intern at the Hines VA Hospital, Chicago, Illinois.

To prepare this material for judging, we assembled the statement sheets and TAT predictions into booklets. The introductory pages of each booklet were devoted to the entire TAT summary (prediction). The remainder of the booklet was divided into sections representing the subheadings Peer Area, Family Dynamics, Mental Functioning, and Emotional Adjustment. The material was arranged so that criterion statements for each area were faced by the corresponding TAT material. These booklets were given to the judges, in a prearranged sequence (see below), together with instructions and an outline of the 27 statement [...]

[...] experienced clinicians and projective analysts [...] chosen as judges. This is a considerably [...] number than the [...] each judge [...] orders were [...] three judges (a judge “trio”) receiving [...] the patterns. [...] hand picking of judges, [...] correlation between decisions of the three judges [...] test sheet equally often [...] first, last, and intermediate [...] matchings were [...] judge trio. In this way, we [...] to avoid and/or to detect practice effects which might affect decisions made later in the series of matchings. In addition to facilitation with practice, other [...] In a pilot study on only one triad, [...] “practice” effect from one [...] the subsequent task [...] For the nine judges in [...] the presentation of [...] were balanced, so that [...] compared with [...] effects of lateness in the series might stem from [...] sequence of criteria judged (one criterion might sharpen a judge’s perception of [...] criterion information), from a particular [...] of presentation of the individuals in a triad. [...] instructions were given to judges in [...] standardization. Judges’ ratings were made on separate scoring forms.*

* See Appendix A, Table [...]. See Appendix A, Figure [...].

ANALYSIS OF DATA

Two general types of information were sought: first, a measure of agreement between the TAT and the criterion data and, second, an evaluation of the variation in agreement owing to different matching operations, including some indication of the major sources of such variation. For these [...] measurement, we have undertaken two relatively independent analyses.

“Agreement” refers, in this study, to the degree of accuracy with which statements on the criterion sheets are assigned to the individual from whose data they were actually taken, guided by the TAT prediction. The agreement measurement is given by a method suggested for this study by David Wallace, of the Department of Statistics, University of Chicago. The Wallace index serves a purpose analogous to the chi square measure in the earlier Cronbach study, that is, it provides a measure of the degree to which judges’ correct ratings exceed the correct ratings expectable by chance. This analysis does not rely, as does the chi square, on the assumption of independence of each single matching. The only independence which must be assumed is that between blocks of scores. Even so, it is an assumption we cannot totally fulfill. We have incorporated into the research at each [...] possible control against such [...]. To begin with, the data itemized for this research is the generally much more fractionable criterion information. [...] statements are less dependent [...] taken from a projective summary. We cannot assume that each statement about the S is totally independent of
every other. Secondly, randomization of the statement items on the composite test sheets, together with balanced presentation order of the material to the judges, renders it difficult, if not altogether impossible, for a judge to make any decisions by elimination. Finally, the heterogeneity of the usual random sample of normal Ss is decreased in our extraordinarily similar population. This fact implies a decrease in strong contrast between Ss, and a consequent increase in the applicability of all statements to all Ss.

Despite these efforts at control, it is still possible that a judge’s response to one item matching might limit the decisions he could reasonably make on subsequent items. But such qualifications apply to most statistical analyses based on sampling techniques when these methods are used to evaluate organismic, psychological variables.

Our agreement statistic thus shares some of the limitations of the chi square, but has several advantages over that statistic; it is parametric and, therefore, uses the available data more efficiently. Wallace’s index also enables us to move close to Cronbach’s goal of assessing the degree of agreement, by giving a value of zero to chance matching and a value of +1 for perfect matching. Although the exact nature of the intervals on our agreement scale is unknown and the scale is unstandardized, we do know the lower and upper limits of the scale. If we cannot say for certain how “good” a score of, e.g., .30 is, we can say that it is “better” than a score of .20 and “less good” than one of .40.
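The text does not give the formula for Wallace’s index, so the following Python sketch should not be read as that index. It only illustrates, under an assumed and much simpler form, what it means for an agreement scale to place chance matching at zero and perfect matching at +1. The function name, the argument names, and the chance probability of 1/3 in the usage lines are all hypothetical.

```python
def chance_corrected_agreement(n_correct, n_decisions, p_chance):
    """Generic chance-corrected agreement score (an assumed form).

    Rescales the number of correct decisions so that the count expected
    by chance maps to 0 and a perfect record maps to +1. This is NOT the
    Wallace index itself, whose formula is not stated in the text.
    """
    if not 0 <= p_chance < 1:
        raise ValueError("p_chance must lie in [0, 1)")
    if n_decisions <= 0:
        raise ValueError("n_decisions must be positive")
    expected = n_decisions * p_chance
    return (n_correct - expected) / (n_decisions - expected)


# Hypothetical usage: 27 decisions, each assumed to have a 1/3 chance of
# being correct by guessing alone.
at_chance = chance_corrected_agreement(9, 27, 1 / 3)
halfway = chance_corrected_agreement(18, 27, 1 / 3)
perfect = chance_corrected_agreement(27, 27, 1 / 3)
```

On such a scale the ordering of scores is meaningful even when the spacing of the intervals is not, which is the point made above about scores of .20, .30, and .40.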


In our measure of agreement, two types of error are distinguished and computed separately:

1. Errors of discrimination: Because of the homogeneity of the population, there is a large (and uncontrollable) source of statistical “error” in the fact that statements applying to one individual in the group may well apply to at least one other in the same triad. There are many cases of actual resemblance between triad members, so great* that a technical “mismatching” of this kind (false positive) need not argue against the accuracy of the TAT analysis.

* Examples of various degrees of S contrast within triads are given in Appendix A.

2. Errors of identification: On the other hand, if a judge calls “unlike” a statement which does in fact pertain to the individual whose TAT prediction is presented, such errors (false negative) constitute genuine refutation.



Each of these errors has, we feel, a different significance in answering the questions addressed by the study. Consequently, there are two sets of accuracy measurements (D scores and I scores), based on these two kinds of mismatching. The design of the experiment was constructed to support an analysis of variance. The design properties have been utilized to yield this more refined assessment of the differential effects of different conditions of matching (judge differences, criterion differences, practice effects, subject differences, differences of presentation order, etc.). We can also examine some of the possible interaction effects: judge/criterion, judge/area, etc.
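The distinction between the two kinds of error can be made concrete with a small Python sketch. It assumes, purely for illustration, that each judge decision is recorded as a “like” or “unlike” rating of one criterion statement against one TAT prediction, and that the I and D scores are simple proportions; the study’s actual scores are computed with the Wallace index, and the function and variable names here are hypothetical.

```python
def i_and_d_scores(true_sources, judged_like, target):
    """Return (I, D) proportions for one composite sheet.

    true_sources: triad position (A, B, or C) each statement came from.
    judged_like:  True where the judge rated the statement as fitting
                  the TAT prediction of `target`.
    I (identification): share of the target's own statements accepted;
        a miss here is a false negative.
    D (discrimination): share of other Ss' statements rejected;
        a miss here is a false positive.
    """
    own = [like for src, like in zip(true_sources, judged_like) if src == target]
    other = [like for src, like in zip(true_sources, judged_like) if src != target]
    if not own or not other:
        raise ValueError("need statements from both the target and other Ss")
    i_score = sum(1 for like in own if like) / len(own)
    d_score = sum(1 for like in other if not like) / len(other)
    return i_score, d_score


# Hypothetical ratings of six statements against A's TAT prediction.
sources = ["A", "B", "C", "A", "C", "B"]
ratings = [True, False, True, False, False, False]
i_score, d_score = i_and_d_scores(sources, ratings, "A")
# One of A's two statements was accepted, so i_score == 0.5;
# three of the four statements from B and C were rejected, so d_score == 0.75.
```

Scoring the two proportions separately keeps a false positive on a genuinely similar triad member from being counted against the TAT in the same way as a false negative.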


Because of the systematic outline of both TAT prediction and criterion data, we can also separate the matchings into areas, looking at, for example, all Peer Area statements as a group or all Family Dynamics statements. It is possible to see in which areas the TAT appears to offer the most accurate analysis and prediction, in which, the least. Here, too, interactions may be most important; in particular, we would expect to find some Area-Instrument (criterion type) interaction, i.e., such questions of overt behavior as “degree of participation in peer activities” or “functioning IQ” are probably more accurately answered by the “objective” criteria than by either observation and interview or Rorschach data. Such directly inferrable, atmospheric items as “general family atmosphere” or “relationship between mother and father” should probably be referred most confidently to the observations and interview reports of trained investigators who actually witnessed the family in action. In the same way, there is no instrument in the criterion battery which can solve the problems of “anxiety” or “level and degree of control” as authoritatively as can the Rorschach.

Thus, if the TAT disagreed with the Rorschach about IQ estimate, but agreed with the standard intelligence test, we would be inclined to weigh the agreement more heavily than the disagreement. If, at the same time, the TAT agreed with the observation and interview data on family dynamics, but disagreed with the Family Relations Questionnaire on the same area, again, we could feel fairly confident of the accuracy of the TAT prediction. Finally, if the TAT predictions of intrapersonal dynamics were confirmed by the Rorschach, but at odds with the same material from the observation-interview reports, we would be inclined to take all of these predictive successes more seriously than the failures.



HYPOTHESES, AND EVIDENCE ACCEPTED AS CONFIRMATORY


Having now discussed the properties of the criterion data, the process of comparing criteria with TAT predictions, and the general nature of the proposed statistical analysis, we can more meaningfully state and elucidate the hypotheses underlying our approach. They are as follows:

1. The conclusions reached by the interpreter in the analysis of a subject’s (TAT) responses are “valid.” As confirmation of the validity of the TAT, we will accept measurements of agreement with the criterion which are significantly better than chance at the .01 confidence level or better. (The choice of this level of confidence is, we feel, necessitated by the large number of decisions involved, which can inflate moderate departures from chance to significance.) Where there are differences in the amount of agreement with the TAT prediction among the several criterion instruments, we take as our standard the criteria cited above. These are particularly qualified to assess each of the various areas, namely:

Area                                   Standard Criterion Instrument
Peer Group and Mental Functioning      Objective Battery
Family Dynamics                        Observation and Interview Data
Emotional Adjustment                   Rorschach

In any cases of divergence among objective, subjective, and projective criterion sheets, agreement with these criteria in the areas assigned to each will constitute acceptable validation of the TAT prediction for that area, even if the other two criterion instruments disagree. By the same token, if the TAT agrees with some other criterion, but does not significantly agree with the one specified for the area, the TAT prediction will be considered to be not confirmed.

2. The interpretations made by the method herein outlined are reliable. In the physical experiment, “reliability” is the measurement of reproducibility: the extent to which a given test procedure and instrumentation measures the same thing, every time, for every S. Establishing reliability is a matter of calibrating instruments and checking the error function of the tester, since the tested properties do not change.

In psychological testing, the case is somewhat different. Objective psychological tests do permit a relatively simple operational concept of reliability. Because the items of such tests are assumed not to be sequentially dependent, it is possible to compare parts of the same test protocol to see if they say the same thing about the S. It is possible also to retest, to see if later administrations of the test agree with the first. But at this point, even in objective testing, problems are raised by the timing element. If the interval between readministrations is too short, the second test may simply be a recall of responses made in the first. If the interval is too long, significant changes may have taken place, which appear (and should appear) in the test. The standard objective measures, such as IQ, take this into account by prescribing the amount and the direction of change. That is, the same relationship of subject-to-test, in terms of score, is represented, at specified time intervals, by specified differences in the actual test performance.
But the problem is not limited to objectively measured psychological functions. A valid test, whether objective or otherwise, must take into account both the constancy of the subject-test relationship and the changes in actual responses. These are the familiar Scylla and Charybdis of psychological testing: stability and sensitivity.

With projective testing, the case is even more complex. Since projective tests do not have a true criterion of "correctness," the dimensions of interpretation are greatly multiplied, or possibly infinite. In those psychometric tests that ignore patterning of responses in arriving at their principal assessment measure, the incorrect responses are, as Rosenzweig (1949) says, "thrown away." In projective tests, the "incorrect" responses (divergent from normal or popular) are positive data. "Error" becomes "individuality." In projective testing, subject-test relationships are multidimensional. Furthermore, the dimensions are highly interdependent, in the sense that interpretation of each one depends on the context of all of the others. Prescribing specific response changes which will define a constant relationship of S to test over time becomes prohibitively difficult.

Because we are interested in both of the reciprocal functions of stability and sensitivity, the question of reliability becomes increasingly difficult to answer. We ask "How does the subject express his constant self in different contexts" (of age, situation, etc.), and we ask the same test responses to reflect both parts of the answer.

There are, in fact, two general notions of reliability which are often used interchangeably in psychological testing. The first applies to the consistency of the test items themselves; the second, to the efficiency of the test in registering the consistency of an S from time to time or from situation to situation. The question of item consistency (as measured by split half, alternative form, or other methods which test homogeneity of test parts) is chiefly of importance in psychometric testing, in which the treatment of scores as additive and evenly weighted requires this concept. The interpretation of projective responses does not depend on such an assumption. Indeed, the establishing of item correspondence in projective tests would destroy one of the major projective test variables: sequence, that part of the stimulus value of a test item which is due to its placement in the series, and its subsequentness to particular other items. In order to utilize the information given by sequence, the projective tester makes it impossible to establish any clear measure of "equivalence" in either the split half or equivalent form tests of item consistency.

It is in the second general type of reliability (stability across time) that projective testers are primarily interested. As noted above, the measuring of stability in projective test responses is complicated by the same factors which posed problems in the stability of objective test performance. There is still the possibility, as in objective testing, of contamination in retesting by a superficial recital of previous responses. Again, if the interval between tests is too long, there is the problem of registering actual change in the S, and the changes which may be registered are much more difficult to specify and interpret.

There is yet a third complication in considering the concept of reliability in projective testing; it is the necessity of using test score concepts and methods to analyze data which are composed, not of test scores, but of interpretive opinions. The Rorschach seemingly avoids this difficulty by translating many classes of interpretive opinion into score form, but it encounters the problem face to face, after the scoring is done, in the question of score interpretation. However, researchers have found some compensation for the additional complexity of the interpretation; the recording of interpretive judgments at last gives the researcher some data which can be quantified and compared. If several judges interpret the same test protocol in the same way, then both test and interpretive method are likely to be "reliable." Some clinicians believe that the only meaningful reliability measurement is interinterpreter agreement. Kogan and Hunt (1950), in discussing the reliability problem, reason that the agreement of judges corresponds to the agreement of scores. They reach this conclusion because the judge's analysis is the raw data (corresponding to the data of test scores). Using a single variable (movement of the patient) in casework and a simple interpretive convention (numerical rating of -2 to +4), they use several analyses to demonstrate statistical probabilities of from .061 to .05 for the chance occurrence of the measured agreement between their judges.

There are several major obstacles to the interjudge agreement concept of reliability. The first is the inability to assess particular segments of the entire stimulus-response-interpreter-analytic method sequence. If the judges do agree, all of these processes are likely to be reliable, but if they do not, the meaning is not so clear. The lack of conformity may result from semantic divergence between the judges' understanding of the terms used, or from divergent or inadequate interpretive methods. One way around this is to measure the "self congruity" of a single judge, as suggested by Combs (1947). However, as discussed above, the difficulties in setting time intervals with relation to responses now contaminate the judges' interpretations. A rejudgment, as well as a retest, is subject to the influence of recall of previous judgment. Similarly, if the interval is too long, the judge himself may have changed, or at least changed his interpretive methods. In any case, the interjudge agreement measure is much nearer to "item reliability" than it is to the "stability" concept which more properly pertains to projective testing.

Assessing the stability of projective prediction is possible through confirmation of future predictions made from the test analysis or through recognizability of the S by his test interpretation, at points in time previous or subsequent to the time of testing. Both of these factors are built into our study. Because of the longitudinal design of the original project, several of the test interpreters utilized the opportunity to make predictions from the test material, to be confirmed or refuted by subsequent events. The TAT is one of the tests containing a number of developmental predictions for most Ss. Furthermore, the TAT represents a single point in the span of a several-year study, during a period of great change in the lives of the Ss. It is compared with data which cover the entire span of the research. This applies particularly to the observation and interview material which was continuous throughout the duration of the project, but it is also applicable to much of the other data. Sociometric tests and several of the other objective criteria were readministered at least once, after a several-year interval. In over a third of these the Rorschach was also readministered. Consequently, we will accept the reliability (stability) of the test predictions as confirmed if the I and D scores are correctly judged at the .01 level of significance, or better.


It is perfectly reasonable to question the usefulness of any reliability assessment, in experiments resulting in high validity measures, since the function of reliability is to permit valid measurement. In the strictest sense, a reliability test is only necessary if we wish to know how much we can increase the measured validity, since reliability sets limits on how "valid" a test may show itself to be. In this type of validity test, for which the concept of reliability is nebulous at best, we cannot "think of reliability in the usual sense, as the 'sine qua non' of our acceptance of findings" (Combs, 1947, p. 263). But it is our belief that the stability factor, at least, is manifestly worth knowing, and worth testing, since the clinical use of the TAT, like any other diagnostic test, is based in part on the knowledge of its ability to tap enduring characteristics of the personality.

3. The inferences drawn about a given person from TAT protocols alone are unique to the individual, independent of circumstantial factors.

a. They typify an individual as an integrated person, who is recognizable from age to age and from situation to situation.

Recognition of Ss as whole and consistent individuals will be considered to be confirmed by greater than chance agreement of TAT predictions and criteria, based on the "Identification" score and significant at or above the .01 level.

b. Inferences from the TAT represent the person at a deeper level than the socially determined group responses of his milieu.

We will accept as confirmation of this uniqueness factor, a measure of agreement at or above the .01 significance level, based on the "Discrimination" scores. Considering the homogeneity of the population, the ability to separate the individual members of the triad, by TAT prediction, bespeaks the TAT's ability to represent the uniqueness of an individual, as distinct from his social group.

4. Current behavior, underlying dynamics, salient historical facts, and probable direction of change of an individual can be inferred dependably from the TAT.

Of this hypothesis, we can expect only partial correlation. No one really expects one clinical technique to take the place of a whole battery of others. There may be, for example, areas, or at least items, in which the TAT taps underlying motives only, and is thus "inaccurate" in predicting behavior, as compared to the objective tests in the battery. It may also be that there are parts of the inner personality dynamics which are "too deep," as some clinicians have suggested, for the TAT to tap. However, we should be able to conclude with some particularity exactly where these areas are, and to provide some explanation of why they are. The evidence we will accept as confirmatory of the parts of this hypothesis are:

a. Agreement with objective data (current behavior)
b. Agreement with projective data (underlying dynamics)
c. Agreement with subjective data (salient historical facts)
d. Accuracy over the time span of the study (probable direction of change)


It will be noted that the analysis of these data is set up according to a "null hypothesis" philosophy of significance. This is not, to be sure, the most powerful use of experimental design or statistical method. But the alternative to this usage is the use of experimental and statistical techniques to estimate parameters, that is, to measure the confidence which an experimenter may have in a prediction of outcome which he has already made. This kind of "validation" is appropriate for specific fields, or whole scientific terrains, which have data enough to generate such verifiable estimates. Projective testing, as a science, is new enough that the less rigorous, systematic, exploration kind of research based on a "null hypothesis" concept is still the order of the day. It is our hope that this study may help in the formulation of definitive theoretical statements, which may rightly be tested by the confidence interval method. Indeed, it is possible that subsequent research on the data may adopt such a form, once our experimental results are available for direction.


ANALYSIS OF RESULTS

Statistical Procedures

The raw data were in the form of judges' ratings, recorded on the rating sheets which were supplied to the judges, along with the comparison items. To compute the basic scores used in the analysis, these judge ratings were checked against the correct ratings; each of the 26,244 judgments was then clerically coded into one of 7 possible categories:

1. L+ (Correct rating Like; judge rated +)
2. L- (Correct rating Like; judge rated -)
3. U+ (Correct rating Unlike; judge rated +)
4. U- (Correct rating Unlike; judge rated -)
5. L? (Correct rating Like; judge rated ?)
6. U? (Correct rating Unlike; judge rated ?)
7. (No criterion information given; judge rated ?)

Approximately 86% of these judgments were coded into the first four categories, and approximately 14% into the last three categories. However, of the judgments rated "?" by the judges, more than 90% were in Category 7, items for which no information was available in the criterion sheets. Inspection of the remaining 357 "?" items showed no rating trends on the part of judges, but items were scattered through the rating sheets in seemingly random fashion. Thus, no formal analysis was made of the doubtful items.9

The judgments statistically analyzed were those in Categories 1 through 4: L+, L-, U+, and U-. Categories 1 and 4 (L+, U-) are correct judgments, of the Identification and Discrimination types, respectively. Categories 2 and 3 (L-, U+) are respectively Errors of Identification and Errors of Discrimination. Following this coding operation, the data were sorted into various "blocks" of judgments, on the basis of the several test variables. Block Scores were computed:
1. For the permutations of 36 Ss
2. For 12 triads
3. For the four behavior areas
4. For the three test-sheet types (objective, subjective, projective) or "instruments"
5. For the three judge trios (i.e., for each set of judges who performed identical tasks)
6. These data were analyzed in such a way that their block scores could be regrouped and analyzed also for each of the three task orders (first 1/3 of total judgments, second 1/3, and third 1/3) for all nine judges
7. A preliminary exploration was made of the effect of the "sequence" of instruments with which the TAT prediction was matched (i.e., judge matches objective first, then subjective, then projective, or any other of the possible orders). It was felt that, should "sequence" appear to have a significant influence on judges' decisions, it could be factored, balanced and analyzed
8. The data were also sorted and analyzed for those of the interaction effects which were translatable into meaningful psychological variables
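
The coding step described above lends itself to a small lookup table. The following is a minimal sketch, not the clerical procedure actually used: the function and variable names are hypothetical, and the treatment of "?" ratings in Categories 5 through 7 is an assumption drawn from the text's description of the doubtful items.

```python
from collections import Counter

# (correct rating, judge rating) -> category number, following the list above.
# "L" = Like, "U" = Unlike, None = no criterion information (assumed encoding).
CATEGORY = {
    ("L", "+"): 1,   # L+  correct Identification
    ("L", "-"): 2,   # L-  Error of Identification
    ("U", "+"): 3,   # U+  Error of Discrimination
    ("U", "-"): 4,   # U-  correct Discrimination
    ("L", "?"): 5,   # doubtful rating of a Like statement
    ("U", "?"): 6,   # doubtful rating of an Unlike statement
    (None, "?"): 7,  # no criterion information; judge rated "?"
}


def code_judgment(correct, rating):
    """Return the category (1-7) for a single judgment."""
    key = (correct, rating)
    if key not in CATEGORY:
        raise ValueError(f"combination not covered by the coding scheme: {key}")
    return CATEGORY[key]


def tally(judgments):
    """Count the judgments of one block by category."""
    return Counter(code_judgment(c, r) for c, r in judgments)


# Hypothetical block of six judgments.
block = [("L", "+"), ("L", "-"), ("U", "-"), ("U", "-"), ("U", "+"), (None, "?")]
counts = tally(block)
analyzed = sum(counts[k] for k in (1, 2, 3, 4))  # only Categories 1-4 are analyzed
print(dict(counts), analyzed)
```

Sorting into blocks would then amount to grouping the judgments by S, triad, area, instrument, trio, or order before tallying.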


The statistical analyses were performed on the scores of these several "blocks." Each "block score," based on either the Identification or Discrimination decisions (I scores or D scores), represents the percentage of correct ratings in excess of the percentage which would be expectable by chance alone. The rationale for the formula used in arriving at these scores is as follows:

1. We have separated the Like and Unlike statements, since we felt that different problems were involved in both correct and incorrect ratings of the two. The I scores (both correct and erroneous) are concerned with Like statements (those for which "+" is the correct answer), and the D scores with Unlike statements (those properly scored "-").

2. The Null Hypothesis would state that the ratings assigned by judges are randomly distributed, and have only a chance relationship with the Like or Unlike status of the statements about the Ss against whom the TAT predictions were matched. We can express the proportion of correct judgments of Like statements as the ratio of L+ ratings to all Like statements rated. If the ratings are not systematically related to the Like statements, then the expectation of correct judgments of Like statements would be no greater than the proportion of "+"'s made on all items, i.e., no greater proportion of "+"'s will be assigned by chance to the Like statements than to the same number of statements chosen at random from the total. Then the degree to which the actual correct L+ rating exceeds this chance value would be the I score: the percentage of correct rating above chance, with chance equal to zero. By the same reasoning, the formula for arriving at the percentage of correct rating above chance, in the D score, would be the proportion of Unlike statements rated "-" less the proportion of "-"'s made on all items.

9 A summary table of the "No Information" items (Table 7) will be found in Appendix B.

11 It must be noted that, because of the random assignment of statements to the composite test sheets, it is possible for any area of a test sheet to contain statements about the S in widely varying proportions. It is possible for a judge to compare the TAT prediction with a criterion area in which all statements are, e.g., unlike the S, because they come from the criterion data of the other individuals in the triad. A judge making a totally correct set of judgments in such a case would come up with a score of zero for that block of statements, since both ratios would be equal. However, this is only true in the few extreme cases. Since statement assignments were random, this is assumed to be a random effect which will not materially affect comparisons between the scores.

Making use of the fact that chance alone would give a value of zero for these scores, it is possible to set a confidence level for the significance of the mean values of the scores. The index,11 suggested by Wallace, for determining the significance of the judgments, based on the null hypothesis of chance distribution is:

$F = N(\bar{X}_G)^2 / S_M^2$

where $\bar{X}_G$ is the grand mean, and $S_M^2$ the mean square of trios plus the mean square of triads, minus the trio-triad interaction. This denominator was used, rather than the more conventional residual, in order to minimize the interdependence of the separate judgments which comprise the block scores. In particular, this formula takes into account the systematic effects of the judges' individual personalities, and the nonrelevant differential effects of the subject-personalities involved in the various triads. The degrees of freedom figure for the denominator is found by the formula:
df (of denominator) dfx x«tytrx 


— + — 

df y xtyts df s xtyts 
where x = mean square of trios, y = mean square of triads, and z = mean square of the trio-triad interaction.
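
As a rough illustration of the scoring and significance steps described in this section, the sketch below computes block scores, the over-all index, and an approximate df for the composite denominator. Several pieces are assumptions rather than the authors' procedure: the score formulas follow the verbal rationale (percentage correct minus the percentage expected from the judge's over-all use of "+" or "-"), the df function is the standard Satterthwaite approximation substituted for the formula printed above, and all function names are hypothetical. The three mean squares are the Table 1 (I score) values for trios, triads, and the trio-triad interaction.

```python
def i_score(l_plus, l_minus, u_plus, u_minus):
    """Percentage of Like statements rated '+', in excess of chance (assumed form)."""
    n = l_plus + l_minus + u_plus + u_minus
    observed = l_plus / (l_plus + l_minus)   # share of Like statements rated "+"
    chance = (l_plus + u_plus) / n           # share of all statements rated "+"
    return 100.0 * (observed - chance)


def d_score(l_plus, l_minus, u_plus, u_minus):
    """Percentage of Unlike statements rated '-', in excess of chance (assumed form)."""
    n = l_plus + l_minus + u_plus + u_minus
    observed = u_minus / (u_plus + u_minus)  # share of Unlike statements rated "-"
    chance = (l_minus + u_minus) / n         # share of all statements rated "-"
    return 100.0 * (observed - chance)


def validity_index(n_scores, grand_mean, ms_trio, ms_triad, ms_trio_triad):
    """F = N * mean^2 / (MS trios + MS triads - MS trio-triad interaction)."""
    return n_scores * grand_mean ** 2 / (ms_trio + ms_triad - ms_trio_triad)


def satterthwaite_df(terms):
    """Approximate df of a linear combination of mean squares.

    terms: iterable of (coefficient, mean_square, df) tuples.
    """
    terms = list(terms)
    combined = sum(c * ms for c, ms, _ in terms)
    return combined ** 2 / sum((c * ms) ** 2 / df for c, ms, df in terms)


# A block in which every statement is Unlike the S and the judge rates all "-":
# both ratios equal 1, so a fully correct judge still scores zero.
all_unlike = d_score(0, 0, 0, 5)

# Table 1 mean squares: trios (df 2), triads (df 11), trio-triad interaction (df 22).
table1_terms = [(1, 15889.000, 2), (1, 9429.545, 11), (-1, 1799.646, 22)]
df_denominator = satterthwaite_df(table1_terms)
print(all_unlike, round(df_denominator, 2))
```

With the Table 1 mean squares this approximation gives a df of roughly 4, which would be consistent with the critical value of 21.20 quoted for the over-all F if that value is read as the .01 point of F with 1 and 4 degrees of freedom.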


RESULTS

Results of the analyses of variance, and the validity index for these scores are summarized in the two tables below. (The F values marked "+" are significant at the .05-.01 level. Those marked "++" are significant at and above the .01 level. All others are nonsignificant.) Details of computation, and block scores for each of the blocks will be found in Appendix B.

TABLE 1

I-SCORES

Summary of Analysis of Variance

Over-all Significance: $F = N(\bar{X})^2 / S_m^2$ = 64.026* (.01 = 21.20)

No.  Effects                           df          ss          ms  F          Den    F .01
     Total                           3887   4,848,488
  1  Between Trio                       2      31,778  15,889.000  8.829++     13     5.72
  2  Between judge within Trio          6      58,074   9,679.000  12.506++    14     3.12
  3  Between Triad                     11     103,725   9,429.545  9.152++     14     2.82
  4  Between subject within Triad      24     118,817   4,950.708  5.003++     20
  5  Between instrument                 2       2,510
  6  Behavior area                      3     135,829  45,276.333  19.712++    16     4.13
  7  Order                              2      12,283   6,141.500  6.206++     20     4.61

     Interactions
     Total                                  1,062,667
  8  Test Sheet-Area                    6      10,679   1,779.815  1.798       20     2.80
  9  Trio-Area                          6      16,917   2,819.449  1.228       16     3.12
 10  Triad-Area                        33      66,988   2,029.939  .884        16     1.60
 11  Instrument-Trio                    4      17,880   4,470.110  1.391       15     3.83
 12  Instrument-Triad                  22      30,480   1,385.449  1.409       15     2.29
 13  Trio-Triad                        22      39,592   1,799.646  2.325++     14
 14  Judge (within Trio)-Triad         66      51,081     773.954  .782        20
 15  Trio-Triad-Instrument             44     141,379   3,213.167  3.247++     20     1.59
 16  Trio-Triad-Area                   66     151,593   2,296.865  2.321++     20     1.47
 17  Trio-Instrument-Area              12      14,966   1,247.175  1.269       20     2.18
 18  Triad-Instrument-Area             66     141,388   2,142.250  2.164++     20     1.47
 19  Trio-Triad-Instrument-Area       132     430,763

 20  Residual                        3358   3,322,805

Den = denominator used for F (mean square of effect No.). F .01 = F required for significance at the .01 level.

* This F value is given by the denominator outlined on page 32. A less conservative estimate of F, based on the Trio-Triad Interaction alone, as a denominator, yields an F value of 836.953.

TABLE 2


D-SCORES

Summary of Analysis of Variance

Over-all Significance: $F = N(\bar{X})^2 / S_m^2$ = 98.796* (.01 = 34.12)

No.  Effects                           df          ss          ms  F          Den    F .01
  1  Between Trio                       2       2,738   1,369.000  1.479       13
  2  Between judge within Trio          6      22,534   3,755.667  12.471++    14     3.42
  3  Between Triad                     11      13,284   1,207.636  2.971++     14     2.82
  4  Between subject within Triad      24      22,746     947.750  3.134++     20     1.79
  5  Between instrument                 2         765     382.500  .265        15     5.18
  6  Behavior Area                      3      10,287   3,429.000  11.341++    16     4.13
  7  Order                              2       2,962

     Interactions
     Total                                    340,979
  8  Instrument-Area                    6       1,393     232.289  .768        20     2.80
  9  Trio-Area                          6       4,867     811.302  1.079       16     3.93
 10  Triad-Area                        33      20,285     614.720  .818        16     1.60
 11  Instrument-Trio                    4       1,415     353.793  .245        15     3.83
 12  Instrument-Triad                  22      10,544     479.317  .332        15     2.29
 13  Trio-Triad                        22      20,359     925.422  3.963++     14     2.08
 14  Judge (within Trio)-Triad         66      15,412     233.561              20     1.47
 15  Trio-Triad-Instrument             44      63,471   1,442.541  4.771++     20     1.59
 16  Trio-Triad-Area                   66      49,607     751.632  2.486++     20     1.47
 17  Trio-Instrument-Area              12       6,604     550.374  1.820++     20     2.18
 18  Triad-Instrument-Area             66      37,087     561.924  1.858++     20     1.47
 19  Trio-Triad-Instrument-Area       132     125,347     949.554  3.140++     20     1.32

 20  Residual                        3358   1,015,257     302.339

+ = significant at .05-.01; ++ = significant at .01 or better. Den = denominator used for F (mean square of effect No.). F .01 = F required for significance at the .01 level.

* The trio-triad denominator yields an F for D scores of 749.107.
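
The F column of Table 2 can be spot-checked from the mean squares and the listed denominators. The sketch below is only a consistency check, assuming the effects are numbered in table order (so that 13 is Trio-Triad, 14 is Judge (within Trio)-Triad, and 15 is Trio-Triad-Instrument); that numbering is inferred from the arithmetic rather than read from the table, and the names used are hypothetical.

```python
# Mean squares read from Table 2 (D scores), keyed by the assumed effect number.
MEAN_SQUARES = {
    1: 1369.000,   # Between Trio
    5: 382.500,    # Between instrument
    13: 925.422,   # Trio-Triad
    14: 233.561,   # Judge (within Trio)-Triad
    15: 1442.541,  # Trio-Triad-Instrument
}


def f_ratio(effect, denominator):
    """F as the effect mean square divided by the denominator mean square."""
    return MEAN_SQUARES[effect] / MEAN_SQUARES[denominator]


# (computed F, F printed in the table)
checks = {
    "Between Trio": (f_ratio(1, 13), 1.479),
    "Between instrument": (f_ratio(5, 15), 0.265),
    "Trio-Triad": (f_ratio(13, 14), 3.963),
}

for name, (computed, tabled) in checks.items():
    print(f"{name}: computed {computed:.3f}, tabled {tabled:.3f}")
```

The three rows shown agree with the tabled values to within rounding; not every row of the table reproduces this cleanly, so the check is illustrative only.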


The interjudge reliability was estimated by selecting pairs of judges from each trio and comparing their performances in terms of frequency of occurrence in each of the 4 coded categories (L+, L-, U+, U-), by means of 2 X 2 contingency tables and the formula

$\phi = \dfrac{AD - BC}{\sqrt{(A+B)(C+D)(B+D)(A+C)}}$

in which the dichotomous values of 0 (negative judgment) and 1 (positive judgment) represented each judge decision. Phi values for the three trios were: For Trio 1, .414; Trio 2, .380; and Trio 3, .459. Although phi tends to yield somewhat lower scores than the corresponding rho values, none of these pairings can be assumed to correlate above the .5 level.
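
The phi coefficient for a pair of judges can be computed directly from the four cells of the 2 X 2 table. The following is a minimal sketch under stated assumptions: the cell layout (A = both judges positive, D = both negative, B and C = the two kinds of disagreement) is the conventional one and is assumed here, and the counts are hypothetical, not data from the study.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table with cells [[a, b], [c, d]]."""
    denominator = math.sqrt((a + b) * (c + d) * (b + d) * (a + c))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical pair of judges rating 80 statements:
# both "+" on 30, both "-" on 28, and disagreeing on the remaining 22.
example = phi_coefficient(30, 10, 12, 28)
print(round(example, 3))
```

Perfect agreement gives a phi of 1 and agreement at the chance level gives 0, which is the scale on which the reported values of .414, .380, and .459 should be read.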
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WILLIAM 


INTERPRETATION OF RESULTS 


Consequences for Specific Hypotheses

1. The validity of the TAT. We selected, as the evidence for confirming our validity hypothesis, I scores which showed a greater than chance relationship between TAT and criteria, at the .01 level of confidence. The over-all analysis of "I" score blocks shows significant agreement well beyond the agreed-on level, and we may consider our first hypothesis to be confirmed. There was no significant variation in the agreement of each of the criteria with the TAT prediction.12 Consequently, it was not necessary to base the confirmation of this hypothesis upon the somewhat arbitrary selection of a "standard" criterion for each area.

Despite the rather surprising lack of conformity among the individual judgments of these expert judges, the combination of judges' scores matched the correct "I" ratings at or above the .01 level of confidence. We will have more to say about the meanings of these effects below, but it is safe to assume that the lack of judge agreement reduces the apparent TAT-criterion measurement and that the actual agreement between our projective prediction and the criteria used may be even higher.

2. The reliability of the TAT. As predicted, the "stability" aspect of reliability was the only one on which we are able to reach any conclusion. The fact that both I and D scores exceeded expectable chance values for all Ss allows us to say that the TAT predictions have withstood the test of time. The criterion data matched against the TAT were based in many cases on several years' study. The TAT interpretation was drawn from a single administration at some point during this interval of study. The agreement of the TAT with these long-term observations and with the repeated criterion testings reflects the value of the test in predicting future trends and inferring at least the recent history of these Ss.

3. The discrimination power of the TAT. We find that our hypothesis about the usefulness of the TAT in expressing the uniqueness of individual personality is confirmed by the high D scores. The measure of conformity on these scores indicates that the TAT prediction provides a guide for a trained clinician, by which he is able to recognize a unique individual, even against a social background which equates virtually every aspect of that individual except his individuality.

12 See full discussion of this finding below.


4. The scope of the TAT. The instruments which provided our criteria for comparisons were assumed to represent the different personality levels. The lack of significant instrument variation in TAT-criterion agreement, together with the stability of prediction discussed above, allow us to claim for the TAT, within the limits of our population and our analytic tools, the wide scope which our hypothesis originally stated: all three levels of behavior, and the past and future dimensions of time as well as the present, can be dependably inferred from TAT data alone. However, some of the differences which were anticipated in the first formulation of this hypothesis were found. The TAT does differ in its predictive efficiency as a function of the area of prediction, the individual S, the judge making the decisions, etc. Although none of these factors "disqualifies" the TAT prediction, all of them provide us with increased understanding of the differential sensitivity of the TAT in various contexts.


Discussion of the Major Findings

1. The high significance level of the TAT-criterion agreement in all contexts is certainly our principal concern. However, there are several qualifications which might be attached to this finding:

a. The large number of judgments tabulated in computing scores. It is clearly true that the larger the number of matchings used, the smaller the magnitude of consistent agreement necessary for a significant level of congruity. This "inflating" property of the large number of matchings analyzed is, we feel, more than compensated by the difficult task we have set for the TAT. The "blind" analysis of the S's responses was undertaken on a group which was homogeneous for virtually all variables except unique personality. Then this analysis of a single TAT administration was matched against data intensively collected over a several-year period. The personality information with which TAT predictions were compared was comprehensive in level and scope and represented a duration of time which was, with respect to the single TAT administration, both past and future, as well as concurrent.
b. The low interjudge reliability: The lack of conformity shown between judges' matchings was not anticipated. The finding is not unprecedented. Interjudge reliability for similar tasks has been reported variously as high (Krugman, 1942), very low (Cox & Sargent, 1950), and indeterminate (Havighurst & Taba, 1949). However, we had expected much higher correlation on the basis of our judge selection and the results of the pilot research. Pairings of the three judges in this pilot study yielded reliability coefficients of .87, .91, and .94. Two of the same judges participated in this study, and the other seven were equally trained and similarly oriented. The only known difference in the selection of judges between the pilot study and the larger study was that all three judges of the smaller research were women, while the present judges are five women and four men. There is no obvious relationship between gender and interjudge agreement, and since the single pair of female judges whose agreement was computed produced essentially the same level of agreement as the other two pairs, it is probably safe to assume that the gender of judges was not the determining factor in producing these unexpected results.

The changes made in the tasks required of the judges from the pilot study to the larger research are far more likely to have caused the difference in judge agreement. These changes were of two kinds: a change in the number of matchings which the judges were asked to do and a change in the actual material presented: different individuals, different triads (implying different levels of contrast), different criterion sheets. Our analysis of variance strengthens the suspicion of the influence of a change of task. Variations in the judging process due to the effects of different triads, different individuals, etc. reflect the changes in performance due to these factors. The number of tasks seems also to have influenced the quality of the matchings, judged by the "fatigue factor" shown in the significant "order" effect (see below). But whatever the reason, the low interjudge reliability must affect our confidence in the high validity of the TAT predictions. The actual congruity between predictions and criteria is not called into question, since the low judge conformity must reduce, not enhance, the apparent validity. It would therefore produce a low estimate of actual validity. However, the divergence between judges might well cause us to hesitate in asserting the same degree of predictive efficiency for the TAT when used in the clinical situation. We are forced to say: "The validity is there, if the clinician can make use of it."

c. The failure of the data to meet the assumptions of the statistical methods. We have mentioned a single distortion which might result from our statistical techniques: the false "chance" score which could result when the statements in a given block are either all "like" or all "unlike" the S against whom they are being matched. Since the assignment of statements to test sheets was random, this effect might lower the over-all validity estimate, but would only lower it slightly. We would not expect it to have any systematic distorting effect,14 on any particular blocks of scores. Consequently, it would not alter the variance analysis.

There is, however, another aspect of our techniques which might be more seriously questioned; this is the fact that the F test assumes normally distributed, equally variable, and independent scores. The "independence" we have tried to enforce. While no datum concerning a single individual is ever totally independent of other data which refer to the same individual, we have described the precautions taken (in processing criterion information) against the giving of artifactual "clues," which might link the separate statements (i.e., idiosyncratic verbal habits of interviewed parents). An attempt was also made in the computing of significance of score to control against the remaining dependency factors. This was described in the preceding section. It is our belief that these statements are now virtually independent, except for the inherent resemblance of true statements about a single S. The assumption of normality is obviously not met by the distributions of either I or D scores. Inspection of the frequency histograms of I and D scores15 reveals their nonnormality (the D distribution is approximately symmetrical, but leptokurtic; the I distribution has a considerable negative skew). However, it has been demonstrated empirically that neither nonnormality nor heterogeneity of variance, nor a combination of the two, vitally affects the accuracy of the F ratio as a test of the null hypothesis concerning the means of two data samples.16

13 Questionnaires were completed by all judges, with respect to their work patterns, clinical frame of reference, amount and variation of interest in the tasks, etc. A later test is planned to investigate further the effect of these personal variables on the judges' performance.

14 We are partially corroborated in this belief by
the fact that F scores can be obtained using the
residual mean square as a denominator which are
very similar, in every case, to the values based on
the special denominators.

15 See Appendix B.

16 (Norton 1952, p. 714) “Even [in the] case
with populations differing markedly in form and
variance, only about 10% of the mean square ratios
exceeded the F values by 5 percent-points.”

2. Differences between I scores and D
scores. It is of interest to see whether our
division of the matching tasks into the two
processes of identification and discrimination
agrees with the results obtained for these two
sets of scores. Our first observation is that
both scores were very significantly higher
than chance. A closer look reveals the fact
that the actual numerical value of the F ratio
for over-all significance is greater in the case
of the D scores than the I scores. This appears
to be contrary to our original contention
that the scores should be separately
computed, because the task of discrimination,
in a population as homogeneous as this
one, would be a much more difficult one
than that of identification. Errors of the D
variety we felt to be easier to make, and in
a sense justified by the data. The difference
in values of the F ratios seems to say that
exactly the reverse is true. However, a
glance at the ss columns of the two summary
tables offers a clue to the seeming disparity.
The variation in TAT-criterion agreement
based on the I score analysis is consistently
greater than the same measure using the D
scores. The lower variability of D scores
reduces the denominator of the F ratio and
thereby increases its value. The grand mean
for I scores is, as expected, larger (19.680)
than the D grand mean (13.353).
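The argument about the denominator of the F ratio can be illustrated with a small sketch. The mean squares below are hypothetical figures chosen only for illustration; they are not taken from the summary tables of this study, and the function name `f_ratio` is likewise an assumption of the example.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return the F ratio: effect mean square divided by error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Hypothetical mean squares (illustrative only, not the study's values).
# The I-type scores are given the larger effect AND the larger error variability.
f_i = f_ratio(ms_effect=400.0, ms_error=50.0)
# The D-type scores are given a smaller effect but much lower error variability.
f_d = f_ratio(ms_effect=300.0, ms_error=20.0)

print(f"F (I-type, hypothetical): {f_i:.1f}")
print(f"F (D-type, hypothetical): {f_d:.1f}")
```

In this sketch the set of scores with the smaller effect still yields the larger F, solely because its error term is smaller, which is the pattern described for the D scores.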

The lower variability of D scores can be
at least partially explained by the greater
number of D decisions made. For each
TAT prediction, only about 1/3 of the statements
with which it was matched were
actually Like statements. There is also a
possibility that some of the difference in
variability (between I and D decisions) can
be attributed to the different nature of the
tasks. The I judgment (deciding whether
a correspondence does exist between an S's
TAT prediction and a given statement) was
expected to be easier and more obvious and
superficial than the exacting, hair-splitting
decision that a statement does not match
the prediction (D judgment) in a homogeneous
population. It is conceivable that
the I judgments tended to be more spontaneous,
to involve less intense concentration,
and perhaps to reflect more the interpersonal
differences between judges than
the D judgments. The much lower “between
trio” and “between judge (within
trio)” mean squares on the D scores might
tend to bear this out. Thus the D scores
may show lower judge variability because
the judges, in deliberated decisions, behaved
more like clinicians, while in the
spontaneous I judgments, they behaved
more as persons. However, the I and D
scores show generally parallel trends, and
the discussion to follow, on the principal
effects noted in the analysis of variance,
can thus be applied justifiably to the combined
I and D findings for each variable.

3. The principal effects noted in the anal- 
ysis of variance. 

a. Area. By far the greatest single
effect relates to differences in the behavior
area in which a prediction was made (Peer
Group, Family, Mental Functioning, Emotional
Adjustment). This effect is entirely
in keeping with our expectations.

The reason commonly given for assessing
personality by a battery of tests rather
than a single instrument is that the tests
will be focused on different particular areas
of personality function. The TAT must
also have its particular spheres of maximum
efficiency. Scores in the four areas divide
themselves neatly into two groups, Area 1
and Area 4 (Peer Group and Emotional
Adjustment) of virtually equal percentage
above chance. These are noticeably more
accurately matched than Areas 2 and 3
(which have also virtually the same
score).17 Since there are no significant
effects attributable to either the instrument
used (test sheet type) or interaction between
area and instrument, we can only discuss
the total area effect, as applying equally
to all the three criterion types.

Literally, this tells us that the TAT is a
better predictive guide to personality function
in the context of peer group relations
and emotional adjustment (or intrapersonal
dynamics) than in the family relations or
mental functioning contexts. The pairing
of areas in these two divisions has some
elements of incongruity. The peer area reflects
the most overt of the four areas and,
also, the most “social” factor. The other
social context, the family area, is considerably
involved in matters of covert, emotional
currents, and much more focused on
the inner significance, to the S, of the corresponding
behavioral events. The emotional
adjustment section, on the other hand,
is the least overt and “social” of the four
areas; it is almost entirely occupied with
the private world of the S, conscious and
unconscious, and with the relationship of
the S to himself. It would have seemed
more directly logical, on the basis of both
the overt/covert and the social/private dimensions,
for the “peer group” and “family
dynamics” areas to have become paired,
pairing also mental functioning and emotional
adjustment. The single quality which
might fit the division made by the scores is
their placement in time. The peer group
and emotional adjustment areas are perhaps
oriented somewhat more to a current point
of time, while the family dynamics and the
mental functioning sections could be said to
represent more enduring functions, with
historical roots (family) and developmental
portents (Mental Functioning). This distinction
is not unequivocal by any means,
but it is the only tentative explanation
which suggests itself from the data at hand.

17 Area scores: Areas 1 and 4 show a percentage
above chance in I scores of 24.849% and 26.266%,
respectively; in D scores, 15.668 and 14.019, respectively.
Areas 2 and 3 show corresponding
scores of 13.281 and 14.323, respectively; D scores,
11.453 and 12.271.
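As a check on the arithmetic, the area scores given in the footnote on area scores can be averaged and compared with the grand means reported earlier (19.680 for I scores, 13.353 for D scores). The sketch below assumes that the four areas are weighted equally in the grand mean; that weighting is an assumption of the example, not something stated in the text, and the variable names are illustrative.

```python
from math import isclose
from statistics import mean

# Percentage above chance by area, in the order Area 1 (Peer Group),
# Area 2 (Family), Area 3 (Mental Functioning), Area 4 (Emotional Adjustment).
i_scores = [24.849, 13.281, 14.323, 26.266]
d_scores = [15.668, 11.453, 12.271, 14.019]

# Assumption: the grand mean is the unweighted mean of the four area scores.
i_grand = mean(i_scores)
d_grand = mean(d_scores)

# Means of the better-matched pair (Areas 1 and 4) and the other pair (Areas 2 and 3).
i_high = mean([i_scores[0], i_scores[3]])
i_low = mean([i_scores[1], i_scores[2]])

print(f"I grand mean: {i_grand:.3f} (reported 19.680)")
print(f"D grand mean: {d_grand:.3f} (reported 13.353)")
print(f"I scores, Areas 1 and 4 vs. Areas 2 and 3: {i_high:.3f} vs. {i_low:.3f}")

# Both unweighted means agree with the reported grand means to rounding.
agrees = isclose(i_grand, 19.680, abs_tol=0.0005) and isclose(d_grand, 13.353, abs_tol=0.0005)
```

Under the equal-weighting assumption the area scores reproduce the reported grand means to the third decimal place, which supports the reading of the footnote figures given here.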

There are two extraneous factors
which should be noted as possibly contributing
to the area effect. One is the special
interest of the interpreter (Henry) in the
field of peer interaction, as well as his sharing
of the projective tester's “occupational
interest,” a particular concern with interpersonal
dynamics. Henry’s (1956) concept
of the shaping of personality by the
reciprocal processes of socialization and individuation
reflects this double interest. The
second extraneous factor which might contribute
to the observed area effect pertains
to the Ss of the study. During adolescence,
a great deal of psychic energy is bound up
in two processes: (a) the establishing of
relationships with the peer group, and (b)
the psychic civil wars of emerging sexuality,
developing self-concept, and the formalizing
of attitudes and values. It is
possible that the greater accuracy of TAT
interpretation and/or judge decision in
these two areas is an artifact of the life
stage of these Ss. A greater emphasis on
and intensity of the Peer Group and Emotional
Adjustment issues may have sharpened
the contrasts between individuals for
the related statements.

It should be reiterated that the lower 
efficiency of judgment in the family and 
mental functioning areas is only a relative 
matter. Agreement was considerably greater 
than chance for all four areas. The differ- 
ence exists, and is striking, but does not in- 
validate the predictive effectiveness of the 
TAT in either of the less well-matched
areas.18

b. Triad. We have assumed the effect 
of differences between triads to be a “con- 
trast” function. There were some triads in 
which distinctions between the individuals 
were relatively great and others in which
the distinctions were negligible. This func- 
tion would also be reflected in any inter- 
action effect to which the triad differences 
contribute. 

c. Trio. Although it is possible to ana- 
lyze the difference between the three groups 
of judges as a single effect, this variable is 
probably confounded by other variables, 
and for that reason is probably not very 
meaningful. The different trios are sepa- 
rated by two sets of partitions: The differ- 
ent task sequences and orders assigned to 
the three trios and the individual person- 
alities which participated in the judgments. 
Although we have attempted to extricate 
from this “single” factor the proportions 
contributed by its definable parts, we have 
not tried to interpret the trio effect, either 
singly or in interaction, but have preferred 
to concentrate on the factors of judge conformity
and of presentation effects.

18 A study by Graham (1947) focused on just
the mental functioning area of these same data
confirms the predictive efficiency of the TAT in
this less efficient area. In this study the correlation
between blind IQ estimate from the TAT and
Stanford-Binet IQs was .84.

(1) Judge Conformity. There are
two measures of judge conformity. In the
analysis of variance, this factor is represented
by the “judge (within trio)” main
effect. Of this effect, we can note merely
that it is the second largest effect in the
combined I and D scores (its differential
influence on I scores and D scores is discussed
above). The other measurement of judge
conformity is the correlation measure, based
on the similarity of decision on matching
each individual statement between pairs of
judges from each of the three trios. These
phi coefficients for judge agreement on performance
of identical tasks are surprisingly
low.
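For readers unfamiliar with the measure, the phi coefficient between two judges can be sketched as follows. The decision vectors are hypothetical (1 for “statement matches the TAT S,” 0 for “does not match”) and the function name is an assumption of the example; the study's own computations are not reproduced here.

```python
from math import sqrt


def phi_coefficient(judge_a, judge_b):
    """Phi coefficient between two equal-length sequences of 0/1 decisions."""
    if len(judge_a) != len(judge_b):
        raise ValueError("both judges must rate the same statements")
    pairs = list(zip(judge_a, judge_b))
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)  # both say "match"
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)  # only judge A says "match"
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)  # only judge B says "match"
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)  # both say "no match"
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denominator == 0:
        raise ValueError("phi is undefined when a judge gives a constant response")
    return (n11 * n00 - n10 * n01) / denominator


# Hypothetical decisions by two judges on the same eight statements.
judge_a = [1, 1, 0, 0, 1, 0, 1, 0]
judge_b = [1, 0, 0, 1, 1, 0, 0, 0]

phi = phi_coefficient(judge_a, judge_b)
print(f"phi = {phi:.3f}")
```

The two hypothetical judges agree on five of eight statements, yet phi is only about .26, which shows how a modest coefficient can coexist with each judge individually making many correct decisions.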

The psychological implications of the 
nonuniformity refer principally to the prac- 
tical clinical use of the TAT. The design of 
this test actually approximates the clinical 
situation: the diagnostic test interpretation 
is presented to a clinician who must—on the 
basis of that report—gauge the needs and 
probable reactions of the patient, both as a 
whole and as a unique individual (the iden- 
tification task) and as distinguished from 
all other individuals (the discrimination 
task). The decisions expected of the clini- 
cian are very like those asked of the judges 
in this study. The results of this study pro- 
pose that, although expert clinicians are 
able to make the correct decisions in each 
case, each one will arrive at the goal by a 
slightly different route. 

(2) The presentation effects can be 
considered to be composed of “Sequence” 
and “Order.” Order refers to the number 
of judging tasks performed previous to the 
task in question, and Sequence refers to the 
nature of the previous tasks. 

(a) The Sequence Effect was not 
included in the analysis of variance sum- 
mary (as noted earlier), because of its 

multiple contamination. The 10 sequence 
means are based on different numbers of 
scores, and are confounded by order ef- 
fects, as well as by the effects of all the 
“instrument” interactions. We felt that 
the labor involved in analyzing this effect 
would only be justified by some evidence 
of significant influence. Our first (rough)
exploration of this factor revealed no significant
contribution, and the analysis was
not carried further. However, the F
values obtained for this unfactored effect
(using the residual mean square as a denominator)
approached significance in
both I and D tables, and some discussion
of the results is justified.


The effect of sequence was among
the most powerful in the exploratory
study for this research. It appeared that
the discriminations made in the immediately
preceding task could facilitate or
impede the subsequent judgment, as if
the concept suggested in considering one
criterion either sharpened or clouded the
perception of the next. It was for this
reason that the exploration of sequence
effects was undertaken. In summary,19
the results of that exploration (from
most effective sequence to least) were:
1. Test preceded by Objective and Subjective matching
2. Test preceded by Objective and Projective matching
3. Test preceded by Projective matching
4. Test first in series (not preceded by any other test)
5. Test preceded by Subjective matching
6. Test preceded by Projective and Subjective matching
7. Test preceded by Subjective and Objective matching
8. Test preceded by Projective and Objective matching
9. Test preceded by Objective matching
10. Test preceded by Subjective and Projective matching
The difference in effectiveness of the sequences
is in only two cases very large,
the first two scoring considerably higher
than the remaining eight sequences. Both
of these sequences specify the Objective
matching as one of the preceding tasks;
thus, it is possible that the findings are
confounded by a tendency of objective
matchings to be somewhat less accurate
than the other two (since the test performance
which measures the sequences’
effectiveness is the last test in the series).
However, the lack of significance of the
instrument effect (test sheet type) prevents
us from stating this as a definite
conclusion. The only conclusion which
can be reached from these data with reasonable
confidence is that there is some
conflict between any preparatory “sharpening
of perception” and what might be
called a “fatigue effect,” since the effectiveness
of “no preceding test” is relatively
high in the series.

19 The actual block scores for the various sequences
will be found in Appendix B, among the
analysis of variance tabulations.

(b) Order: The “fatigue effect” 
mentioned above is confirmed by the 
scores resulting from different position in 
the total task. Between the first, second, 
and third sections of the total matchings 
assigned, judges show a consistent
(though not extraordinarily large) decline
in performance. This applies to both I 
and D scores, and seems to represent a 
simple function of tiring. 

(c) Individual Subject: The dif- 
ferential effect of particular individual Ss 
on the matching performances of judges 
may be enhanced by contrast factors of 
the total triads. However, it is more 
likely to be caused by more personal, subjective
(from the standpoint of the
judges) considerations, such as the 
amount of rapport which a judge could 
establish with a particular S, or the in- 
herent interestingness of that S to that 
judge. The fact that the within-triad 
effect is substantially greater for the I
scores than for the D scores points up this
line of reasoning. If the effect resulted 
to any considerable extent from degree 
of contrast, the effect should be more 
prominent in the task of discrimination. 

(d) Instrument: One negative 
finding is fully as important as the posi- 
tive results of any other of the tested 
variables; this is the nonsignificance of 
the difference in criterion instrument 
(test sheet type), and of the instrument-area
interaction. It was confidently hoped
that the results of this validation study
would provide some clue to the place of
the TAT in a clinical test battery for
personality assessment. Three different
types of instrument were included among
the criteria against which the TAT was
matched, and there was reason to believe
that the TAT would show itself to be
closer to one than another of these techniques.
In stating the evidence acceptable
for confirmation of the validity hypothesis,
a “standard” was selected from the
three criteria for each area of prediction.
This selection was made in the express
anticipation of notable “area-instrument”
interaction. The nonsignificance of these
effects presents three alternative interpretations:

(1) There is actually no differ- 
ence between the reports made from the 
objective, subjective, and projective tests. 

(2) The TAT is so flexible and 
comprehensive that it taps all levels of all 
areas equally well. 

(3) The interpreters of each of 
the criterion instruments knew their tests 
so well that none of them ventured be- 
yond the information legitimately pro- 
vided by those instruments. If this is so, 
the criterion information would vary in 
richness and clarity (as the information 
sought is nearer or further from the 
focus of the particular instrument), but 
need not vary in accuracy; i.e., the statements
provided by an instrument in a
“less compatible” behavior area might be
sparse or vague, but would not be likely
to be false statements, because they would
not be based on overextension of the instrument
in question. Thus, judges might
be required to make decisions on the basis
of superficial or meager information, or
might be faced with a “no information”
item, but would probably not be asked to
match the TAT with an incorrect state- 
ment. Consequently, assuming a consist- 
ent level of validity for TAT interpreta- 
tion, there might be no significant change 
in error score, but a great deal more un- 
certainty. 

1. The first alternative (equivalence of
tests) we can discard, since it is contradicted
by all of clinical tradition and
experience.

2. The second alternative (flexibility
and comprehensiveness of the TAT) is
very tempting, and may well be partially
true. However, the “one perfect” clinical
test idea does not agree with personality
theory, with test theory, or with the
experience-based opinions of testers.

3. The third alternative (limitation of
the scope of interpretation), or at best a
combination of the second and third, is
probably the truest explanation of the unexpected
negative finding. We have one
index to the plausibility of this explana- 
tion. Although the hypothesized “meager- 
ness” and “superficiality” of data are mat- 
ters of opinion, which it would be diffi- 
cult to verify, the “no information” items 
are perfectly definite and verifiable. The 
distribution of no information items20
lends credence to our answer. Using the 
same “standard” criteria for each area 
(which we defined in the validity hypoth- 
esis), we find the “no information” items 
distributed according to this pattern. The 
instrument—area combinations chosen as 
ideal have the smallest number of no in- 
formation entries. The smallest propor- 
tion of “no informations” given for the 
“subjective (observation and interview)” 
criterion was in the Family relations 
area; the smallest proportion of “no information”
items in the Emotional Adjustment
area is from the “Projective
(Rorschach)” criterion; the smallest proportion
of “objective” “no information”
items is found in the Peer Area. The
single contradictory entry in the table is 
the large number of no information items 
found for the objective criterion in the 
mental functioning area. However, 70% 
of these data failures occurred on Items 
14 and 15, which concern creativity and 
fantasy, and about which the objective 
data might be expected to have less to
say.


CONCLUSIONS 


This study represents an attempt to evaluate
with maximal precision the effectiveness
of a particular projective test in the study 
of personality. The over-all goal of the re- 
search was to arrive at as exact a statement 
as possible, as to the amount, kind, and 
degree of accuracy of the information sup- 
plied by the Thematic Apperception Test. 

The conclusions drawn from the results 
of this study have to do with two aspects of 
Thematic Apperception Test validity: (a) 
the accuracy with which predictions can be 
made from the instrument and (b) the use
to which those predictions can be put. The
actual scores, based on judge decisions, provide
the information for evaluation of the
TAT; and the way in which the judges
made their decisions (using both empirical
and inferred evidence of judge performance)
offers clues to the way in which TAT
reports are actually used clinically.

20 Appendix B.


Conclusions about the Test 


1. The TAT as a valid diagnostic instru- 
ment. In a sense, this was the central ques- 
tion asked by the research. However, not 
even critics of the test’s usefulness seriously 
doubt that the personality picture it pro- 
vides has a better than chance agreement 
with any dependable criterion of the sub- 
ject’s personality. The disproving of the 
null hypothesis of only-chance agreement is 
“central” largely because had it not been 
disproven, the remainder of the findings 
would have been meaningless. That TAT-criterion
agreement was so highly significant,
in a study where differentiation between
subjects was so very minute, makes
this finding somewhat more valuable. But 
it is generally conceded that all thoughtfully 
designed personality assessments can probably
produce a psychological portrait which
has some likeness to the model. What we 
can say, by way of original contribution 
from this single finding, is that our meth- 
odological procedures and statistical anal- 
ysis have apparently translated the clinical 
meaning of the data successfully into the 
“language” of experimental method and 
statistical inference. We say this because 
the evidence provided by this study about 
TAT validity agrees with the clinical finding
that “it works.” Experimental statistical
studies of projective test validation have 
not always been able to demonstrate this 
conformity. 

2. The TAT is a reliable technique in 
personality assessment. As noted in the dis- 
cussion of reliability, we are only justified 
in declaring the stability aspect of the con- 
cept of reliability, i.e., that the personality 
revealed by the TAT expresses the constant 
needs, dispositions, and attitudes of the subject.
About the consistency of the test findings
from administration to administration
or from subject to subject and about the 
consistency of interpretation of test find- 
ings, we can make no unequivocal state- 
ment. However, the concepts of stability in 
time and consistency of findings—though 
empirically separable—are not independent. 
If it is demonstrated that a personality test 
reveals underlying trends in the personality 
which are not seriously affected by circumstantial
changes, or growth, or aging, we
can be at least more confident that the test 
will tend to produce essentially the same 
information, from administration to admin- 
istration. This is because the resistance to 
situational factors is being measured in both 
cases. 

3. The TAT can predict accurately, in all 
behavior areas, but is more accurate for 
some than for others. We have inferred 
from the behavior areas of greater and 
lesser agreement that the TAT tends to re- 
flect more the current conscious and un- 
conscious concerns of the subject than the 
purely unconscious and deeply repressed 
sources of these concerns. This conclusion 
is somewhat confounded by the fact that, 
for the subjects in this study, emotional 
adjustment is itself one of the focal con- 
cerns. Because there was no significant dif- 
ference in amount of agreement with the 
different criteria, it is not possible to assert 
whether the TAT consistently predicts more 
accurately at any one particular “level” of 
personality function. 


Conclusions about the Clinical Use of the TAT


The task required of the judges in this 
research is very similar to the clinical task 
of utilizing a psychological test report. In 
planning a therapeutic program, the clini- 
cian must decide from the information in 
the report: (a) The nature of the subject’s 
current social milieu (in order to fit in the 
current fact of therapy); (b) The family 
constellation (in order to select the best 
possible therapist and the best possible in- 
terpersonal approach) ; (c) The intellectual 
strengths of the subject (to estimate the 
amount of help the therapist can expect and
to decide on a realistic aspiration level);
and (d) The subject’s emotional adjust- 
ment (for a detailed picture of the foci of 
the subject’s difficulty, his personal strengths 
and weaknesses, and the probable prognosis 
for, and course of, treatment). Although 
the goals were not the same, and the judges 
may have been less motivated to make the 
correct decision than in the clinical setting, 
the actual judgments made were substan- 
tially those which would be made by the same 
judge clinically. Thus, there is some justi- 
fication for inferring from the character- 
istics of the performances of these expert 
judges in the task at hand to the perform- 
ances of these (or other similarly trained) 
clinicians, in clinical utilization of the TAT 
report. There were several rather striking
trends in the judges’ responses, which lead 
us to conclude: 

1. There is no single correct way of employing
the TAT interpretation. There was
little item agreement between judges, but
each judge made enough “correct” decisions
to yield a highly significant agreement fig- 
ure. Judges may arrive at essentially the 
same interpretive implications of the test 
report, by quite different routes; or judges 
may differ individually in their ability to 
utilize TAT predictions in different areas 
(supported by the finding of significant 
trio-area interaction) or for different subjects
(possibly supported by the significant
trio-triad interaction, although the triad 
effect includes a differential “contrast” fac- 
tor as well). But whatever the route used 
or the implications noted, all judges arrived 
at a substantially accurate understanding of 
each subject, despite great interjudge variation.

2. Judges (clinicians) appear to make
more consistent use of the projective
(TAT) report when the decisions they must
make are definitive and detailed than when
they are relatively simple; i.e., because there
is overlap in the behavioral repertoires of
individuals, it is easy to make the mistake
of including in a subject’s potential responses
those which are more characteristic
of someone else. It is more difficult (and
more inaccurate) to exclude from a subject’s
potential responses those he in fact
displays. Conversely, it is more difficult to
make the first type of judgment correctly
than the second. Although there were, as
expected, more mistakes of the first type (D
errors), the judges varied much more
widely in their ability to make the second
type of judgment (I judgment) accurately.
We have concluded that the “set” necessary
for making fine distinctions (concentration,
deliberation, and impersonal, detached judgment)
is more conducive to consistently accurate
diagnosis than the more spontaneous,
personally colored approach used in making
less tedious comparisons.

3. The judges’ perceptions were sharpened
by consideration of a variety of personality
data, but there is a point of diminishing
returns. Although a certain amount
of facilitation in judgment is produced by
examining the subject from several angles,
there is also a “fatigue” effect produced by
a superfluity of information, and this effect
becomes dominant when the clinician considers
many subjects.

4. Clinicians have their particular “specialties.”
The accuracy of matching differed,
with each judge, from individual to
individual. There was, at the same time, no 
clear agreement between judges as to which 
individuals were judged more accurately. 
The exact cause of the divergence is not 
clear, but the crucial variable is apparently 
strictly personal. That is, it is explained by 
a difference in the way a judge perceives 
certain subjects or groups of subjects. It 
may be that the judge is especially expert 
at recognizing all hysterics; another judge 
matches boys better than girls; a third sim- 
ply finds certain individuals more interest- 
ing than others, and is thus more involved 
with those individuals. In any case, each 
judge fluctuated in the ability to judge dif- 
ferent individuals, and no two judges fluc- 
tuated in exactly the same direction. 

A projected subsequent study, based on 
judges’ responses to follow-up questions 
about attitude, work habits, etc., in the 
present research, may help to refine the con- 
clusions stated somewhat tenuously in this 
section on the clinical use of the TAT.

APPENDIX A
FORMS AND TESTING MATERIALS


TABLE 3

PRESENTATION ORDER OF JUDGING TESTS

[Table body illegible in the source. Column heads: Test, Triad, Subject, and Test Sheets, repeated for TRIO 1 (Judges 1, 4, 7), TRIO 2 (Judges 2, 5, 8), and TRIO 3 (Judges 3, 6, 9).]


THE JUDGING TASK


A sample judging task is presented in the immediately following pages. The S is one of those
actually used in this study. Materials are given in exactly the form and the order in which judges
received them, and samples of all materials are included:

1. Instructions to Judges (“Information for Judges”)
2. Outline of Personality Areas
3. TAT Summary
4. Criterion Statements in Peer, Family, Mental Functioning, and Emotional Adjustment Areas,
facing TAT Summaries of these Areas
5. Sample Judging Form


SAMPLE SCORING FORM

TRIAD ___    TRIAD ___    TRIAD ___    TRIAD ___
Subject ___    Subject ___

Fig. 1. Judging Sheet.


INFORMATION FOR JUDGES 


Each of the enclosed sets of papers contains:
1. An outline of the personality areas covered in this study
2. A complete analysis of a TAT protocol, based on this outline
3. A set of statements, also based on the personality outline, but derived from sources other than the TAT. The source is indicated in each case.


What we are asking of you is, broadly speaking, a matching operation, involving a TAT interpretation compared against a set of statements, both covering the same areas of functioning. Some of the statements on each statement sheet will describe the individual whose TAT interpretation is presented, and some will not. You are asked to decide which of the statements do describe the TAT S. Each statement sheet is made up of a composite of statements about three individuals, one of whom is the S of the TAT analysis. Items for the statement sheet have been assigned randomly and independently from the separate statement sheets of these three individuals. Thus, the proportion of statements referring to each individual, as well as the order and position of items describing a particular S, varies from sheet to sheet. It is possible, for example, for an entire statement sheet to contain only items referring to the TAT S; similarly, it is possible for a particular statement sheet to include no items at all which describe the subject in question, or any proportion between these extremes.
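
The composition of a statement sheet described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes one statement per item number for each individual and an equal chance of drawing from each of the three; the function name, the subject labels, and the placeholder statements are hypothetical and do not come from the study materials.

```python
import random


def compose_statement_sheet(sheets, items, rng):
    """Build one composite sheet.

    For every item number, copy the statement from one of the three
    individuals, chosen at random and independently of the other items
    (assumption: each individual is equally likely to be drawn).
    """
    subjects = sorted(sheets)
    composite = []
    for item in items:
        source = rng.choice(subjects)  # independent draw for each item
        composite.append((item, source, sheets[source][item]))
    return composite


# Hypothetical triad: individuals A, B, C, each with statements for items 1-6.
sheets = {s: {i: f"statement {i} about {s}" for i in range(1, 7)} for s in "ABC"}
sheet = compose_statement_sheet(sheets, range(1, 7), random.Random(0))

tat_subject = "A"
n_true = sum(1 for _, source, _ in sheet if source == tat_subject)
print(f"{n_true} of {len(sheet)} statements describe the TAT S")
```

Because every item is drawn independently, the number of statements that really describe the TAT S can be anything from none to all of them, which is the point the instructions make to the judges.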


Instructions 


1. Before beginning the actual judging, it will probably be useful to study the personality outline, in 
order to familiarize yourself with the areas of comparison with which you will be dealing. 


2. Before starting a particular set of scorings, we ask you to read carefully the complete TAT summary that precedes it, so as to acquaint yourself with the total personality of the individual, against which the statements are to be matched.


3. Rating:

a. In addition to the complete analysis, the TAT summary is also reproduced, area by area, to correspond to a cluster of statements facing it. In each area, we ask you to read over the part of the TAT summary which faces the statements which you are rating.

b. Then read the first statement and judge whether, in your opinion, it describes the S of the 
TAT. If you believe it does, mark a “+” in the appropriate space on the Rating Sheet. If you believe 
it does not, mark a “0” in the space. If you are unsure, score the item “?.” (Note: all statements which 
are listed as “no information” may be scored “?.”) 


c. Make the judging of each statement as independent as possible of decisions you have reached 
on previous statements. 

d. Keep in mind, in scoring these statements, the fact that the population used in the study is 
equated for sex, age, education, and—roughly—for socioeconomic factors (all the Ss were selected 
from the same school). Differences may be slight and subtle and similarities frequent. 

e. Keep in mind also that the items scored “?” are statistically less valuable than “+” or “0” 
items. Do not use a question mark if you can possibly reach a decision. 


4. Check each set of statements to make sure you have scored all of them. 
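
The rating rules above ("+", "0", "?") lend themselves to a simple tally. The sketch below is a hypothetical illustration, not the study's scoring procedure: the "score" actually analyzed is defined in the main text and is not reproduced in this appendix. The function name, the ratings, and the answer key are invented for the example.

```python
def tally(ratings, key):
    """Count correct, incorrect, and unsure marks.

    ratings: item number -> "+", "0", or "?"
    key:     item number -> True if the statement really describes the TAT S
    """
    counts = {"correct": 0, "incorrect": 0, "unsure": 0}
    for item, mark in ratings.items():
        if mark not in ("+", "0", "?"):
            raise ValueError(f"unexpected mark {mark!r} on item {item}")
        if mark == "?":
            counts["unsure"] += 1
        elif (mark == "+") == key[item]:
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return counts


# Hypothetical ratings by one judge and the corresponding answer key.
ratings = {1: "+", 2: "0", 3: "?", 4: "+", 5: "0"}
key = {1: True, 2: False, 3: True, 4: False, 5: True}
result = tally(ratings, key)
print(result)
```

Keeping "?" in its own bin reflects instruction 3e: an unsure mark carries no decision either way, which is why the judges are asked to avoid it where possible.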


OUTLINE OF PERSONALITY AREAS


Social Interaction
A. Peers
1. Degree of participation (integral member, isolate, active participant, number of mutual choices, social success, amount of acceptance, etc.)
2. Nature of participation (leader, follower, conforming, rebelling, dependable, enthusiastic, spontaneous, etc.)
3. Quality of relation to peers (group warm/cool towards him, stimulus value high/low, warm/cool feeling toward others, trusted/not-trusted, attacking/nonattacking, self-justification needs, etc.)
B. Family
4. Feeling toward mother and/or personality of mother
5. Feeling toward father and/or personality of father
6. General family atmosphere (emotional attitudes, kind and degree of regularity, etc.)
7. Sibling relationships
8. Relationship between mother and father
9. Social relation to the Community


Characteristics of the Self
A. Mental Functioning
10. IQ (potential functioning)/present level of functioning (school work, efficiency, etc.)
11. Mental approach and organization (judgment of level of organization and qualitative elaborations: over/under-generalize, over/under-relate, compulsive trend, premature closure or attention to wholes only, etc.)
12. Motivational factors (ambition, energy investment in work, positive interest in achievement, accomplishment, conscientiousness, over/under-scrupulousness, etc.)
B. Imagination
13. Quality of imagination and creativity (high/low/mediocre creativity, originality, amount of restraint/freedom, etc.)
14. Fantasy as an escape (some, none, limited, much, etc., use of fantasy as escape)
15. Nature of fantasy in general (stereotyped, self-glorifying, omnipotence, etc.)
C. Patterns of Emotional Adjustment
16. Basic emotional attitude (active/passive)
17. Impulse life acceptance (accept/reject, important/unimportant, etc.)
Anxiety
18. a. Presence and degree, or absence, and form taken
19. b. Source or primary locus
D. Mechanisms of Defense (Control System)
20. Conscious control/inner control (behavior directed primarily by conscious consideration/inner needs)
21. Outer control (amount and ease of conformity with outside demands)
22. Special mechanisms (comforting devices: achievement, fantasy, sensuousness, etc.; other ego defenses: isolation, reaction formation, etc.)
E. Emotional Reactivity
23. Inner amount of introversion (extent of inner preoccupation, and possibly more specific focus, reason, or nature of inner preoccupation)
24. Outer-extroversive tendencies (amount of interest in the outside world, things/people, action/thinking, etc.)
25. Potentiality (what is likely to happen to the balance of 23/24)
F. Sexual Adjustment*
26. Anxiety and general adjustment (anxiety, hesitancy, fear, etc., amount and kind, excepting comments more appropriate to 27, below)
27. Role (aggressiveness/passivity, amount of interest, participation: social/phallic, etc.)

* In analyzing the judge decisions, Items 26 and 27 (Sexual Adjustment) were considered to be part of the "Peer Interaction" Section and were so counted.


TAT SUMMARY
Triad IV
Individual A

I. Social Interaction

A. Peer relationship. His peer relations should not be good. He is still too concerned with parental problems and how to conform to work out ties with peers. His control is quite gross and his techniques are equally inept.

Overt pattern should be one of some conflict with others. This arises for several reasons:

1. His strong ego and strong inner life
2. His resistance to parental demands
3. His guilt over his resistance

S is a boy, both mature and immature, of strong undisciplined impulsivity that is suppressed by his super-ego demands of parental conformity. In his overt behavior his conflict will be of the willful, resistant, and demanding sort.

B. Family Relationships

Intrafamily dynamics. Mother relationship is a difficult one. The tie is largely a fixation wherein his own spontaneity is viewed as rebellion against the mother and hence arouses guilt. The mother seems rather subtly demanding and oriented more toward achievement demands than to repressive impulse control.

The father tie is a very difficult one in that he is a man of firm and uncompromising demands which S tries to obey. The boy views him as demanding and firm and also as opposed to S's affectionate interest in the mother. It is the father who keeps him away from the mother, he feels.

His high narcissism and infantile emotionality reflects a fixated primary tie to the mother marked not by overaffectionate protection, but by its lack. I presume the tie to have been rather routine and unloving in the early years.

The continued longing for affection, plus the high awareness of control, plus the reprisal for self-affront (3BM-18BM) suggest the home atmosphere has been one oriented by strong-willed parents more concerned with their own interests (i.e., mobility) than with the affection of their children.


II. Physical Characteristics

A. Energy Output. S has a rather strong emotional energy output, the expression of which [...] with rather gross manipulatory, large-muscle movements.

B. Growth Pattern. The TAT suggests that he is post-pubescent and actively oriented heterosexually.


III. Characteristics of the Self

A. Mental Functioning. S's intellectual level is superior (120-125 IQ). His intellectual efficiency, however, is at present greatly reduced. His emotional preoccupations are no doubt responsible partly for this. He shows a lack of ability to discipline his concepts and a low use of whole inclusive concepts though he can be forced into them. He prefers details and is a keen observer of them (Pic. 18).

School adjustment could be very good under certain circumstances where a firm but obviously affectionate teacher could encourage his rich inner life and have him discipline himself. His low efficiency, however, is not conducive to good school work in the traditional sense, and he is not really interested in intellectual matters. His interests lie much more in the manipulation area (Pic. 2).

B. Imagination. S's imagination is a good quality but hardly functional because of his poor discipline and organization and because of his guilt over any form of spontaneity or impulsivity. He has considerable potential here that is hardly being used.

Active fantasy life that takes up some of his energy but that is not completely effective as control.

C. Patterns of Emotional Adjustment. Basic emotional attitude is slightly aggressive. He maintains a strong ego that is hampered by guilt but that is still struggling actively.

Impulse life is rich and accepted. It is confused by his guilt over nonconformity, but this is not impulse-repression.

S has many signs of anxiety and feelings of insecurity, particularly in his peer and in his parental ties. He describes people separately and does not combine them into cooperative congenial groups. His descriptions of people are also generally of restraint and conformity, of worry, of disappointment.

System of control is heavily weighted to the introversive. His conscious control is high (not constricted) and his outer control is only partially effective. Inner control is used but his resources here are so rich that his ego-demands continually overburden his outer control.

Interpersonal relations are his bad point. His view of people as either not loving enough or as demanding makes it hard to establish contact. He is so preoccupied that he is also not making much effort at this area (low C in Ror?).

S has sexual interests that are developing. The fact that his infantile attachment has not been impulse-repressing is interesting in light of the achieving-demanding (rather than highly repressive) mother.


IV. Summary

This is a boy currently preoccupied with the following themes:

1. Longing for protection and love from mother
2. Resistance to accepting demands
3. Guilt over his own resistance

He is a boy of superior intelligence and high creativity who is so restrained and apprehensive over his "self" that he has no energy left for working out interpersonal ties. The dominant causative factors suggested are:

1. Early lack of maternal love
2. Too rigid achievement demands
3. Father attitude too firm and demanding
4. Early lack of emotional contact with father


Sample Matching Form

Triad IV, Individual A


A. Peer Relationships

His peer relations should not be good. He is still too concerned with parental problems and how to conform to work out ties with peers. His control is quite gross and his techniques are equally inept.

Overt pattern should be one of some conflict with others. This arises for several reasons.
1. His strong ego and strong inner life
2. His resistance to parental demands
3. His guilt over his resistance

S is a boy both mature and immature, of strong undisciplined impulsivity that is suppressed by his super-ego demands of parental conformity. In his overt behavior his conflict will be of the willful, resistant, and demanding sort.


B. Family Relationships

Intrafamily dynamics

Mother relationship is a difficult one. The tie is largely a fixation wherein his own spontaneity is viewed as rebellion against the mother and hence arouses guilt. The mother seems rather subtly demanding and oriented more toward achievement demands than to repressive impulse control.

The father tie is a very difficult one in that he is a man of firm and uncompromising demands which S tries to obey. The boy views him as demanding and firm and also as opposed to S's affectionate interest in the mother. It is the father who keeps him away from the mother, he feels.

His high narcissism and infantile emotionality reflects a fixated primary tie to the mother marked not by overaffectionate protection, but by its lack. I presume the tie to have been rather routine and unloving in the early years.

The continued longing for affection, plus the high awareness of control, plus the reprisal for self-affront (3BM-18BM) suggests that the home atmosphere has been one oriented by strong-willed parents more concerned with their own interests (mobility) than with the affection of their children.
fined that one has


has been enjovit 


in it 


group throughout 


[...] for matching presented to each judge was stapled in such a manner that the TAT summary for that area appeared on the left and the corresponding statements to be matched appeared on the right.


spontaneous participation.” 


re on leadership when he 


apparently does not participate very actively 


the S have a positive role 


ost in a scapegoat role 


ted two tronts, first extreme 


ndifference or withdrawal.” 


i warmth and trust in him 


t 


strong impression that he 


g a top status position ina 


is a participant 


ther serves apparentiv as a 


buffer between the tather and the son 


lowever is a strong indication 

iffectio bot Mother works 

he tamily’s op, painting and do 

w other forms ) manual labor In 

Idition does some of the office work 

Mot tat is to man’s 

ominant pla in the The 

other 1s small and pretty and showed 

i great cal o ~;wperation to the 

interviewer! 

Int “She talks with ¢ t intensity clasp- 

ng he ul vetore her and sitting on 

e edg ) hair. She seemed very 

ich moved at times by things she was 

in i very pleasant, charm 

iW pers who undoubtedly verv 

Ob t in au atic attitude 

rhe father det nds a good deal of 

) in | shop work and in general 


eems prone to ignore good work and 


ure especially 
his father isapproval 
rent rence to the 
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| lest Form 8 (Observation and Interview) ee 
Family Dynami : 
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very cord 


when the 


FARLI 


an 


one 


rorm of ¢ 


with a ta 


conversation and manners he seemed a: 


aftable so 


r aomine 


ch 


Mot! 


ound respect and affection tor 


is that his father treated him as a cor 


plete equal in many instances, but that 


) opi 
he w 
ri 
won 

ible 


crs re 
he ran away from home about 
ago her dad was sé | 
il t to even look for him this time 
ss they have a terrible time get 


int 

M hi 
most 

they just 

savs (S's 

more sO 

have 

hit athe 


vou think vou'’re doing it that) wa 


eason. | 


us breath 


his ta 


Ss very attractively painted and fur 


re was a difference ot opinior 


do a thing.” 


nily lives 


d son siding together agains 


r and sister. This t 


ritical teasing.’ 

tall heavily built voung 1 
ce rather like (S’s). He was 
ial and friendly By 


rt of fellow, not overbearing 
‘ering in any forn Mr 

never paid much attentior 
ildren when they were litth 


rer says that (S) has pre 


The impression | gaine 


were ive veig! 
as secretly pleased whenever 
rhit (S) spoke with ad 
ot his father and his remar 
ory at the shop although 1 


incident he mentioned how h 


iction to (S’s) runm 


she says her mother 


s father was so angry witl 
of the time. He just wouldn't 
(S’s) acting that way and 
couldn't get along he 
) doesn’t know how to handl 
he shouldn't be allowed 
His father will say t 
r sharply, ‘Now just what ck 
S) will always have a good 
lis father will mutter under 
to anyone else, ‘He's a bette: 
than most any grown man 
ther won't stand for 


parents, but somewl 


frame house the inter 


almost entirely due to the 
(mother) and her hust 
very interested in childret 
education. Children have 


ire expected to do them.” 


her 
the the j 
Int 
4 
hic : 
fat 
father would” give his commands 
d shortly to him.” 
T.: Reporting an account by (S’s) sister oi 
| 
ting a 
tri 
tor ul 
reason 
mecha 
sermse [To 
: 6& Obs VY) goes to the store Carnes out 
Takes are ot the garbage 
rieips ither Probably held near 
Int.: “The in a relatively 
new White 
: nished 
(Parents) 
ind their 
Each have
elr own nds, do not do things 


up 


bie teased is let sister, but a 


always existed 


etween thr esp bickerings 


osity. betwee 


Ways ovel 


“ r husband and brothers.” 

9. Sub: "I like this place. If I lived in the country there wouldn't be anything to do [...] If I lived in a city like Chicago I wouldn't have a yard and I'd live in an apartment."

[Further fragments of this statement sheet: "[...] average [...] children are [...] talking about too." "[...] together [...] fight some, but stick [...] for one another."]


TAT Summary (left); Test Form [...] (Rorschach) (right)


Patterns of Emotional Adjustment (TAT Summary)

Basic emotional attitude is slightly aggressive. He maintains a strong ego that is hampered by guilt but that is still struggling actively.

Impulse life is rich and accepted. It is confused by his guilt over nonconformity, but this is not impulse repression.

S has many signs of anxiety and feelings of insecurity, particularly in his peer and in his parental ties. He describes people separately and does not combine them into cooperative congenial groups. His descriptions of people are also generally of restraint and conformity, of worry, of disappointment.

System of control is heavily weighted to the introversive. His conscious control is high (not constricted) and his outer control only partially effective. Inner control is used but his resources here are so rich that his ego-demands continually overburden his outer control.

Interpersonal relations are his bad point. His view of people as either not loving enough or as demanding makes it hard to establish contact. He is so preoccupied that he is also not making much effort in this area (low C in Ror?).

S has sexual interests that are developing. The fact that his infantile attachment has not been impulse-repressing is interesting in light of the achieving-demanding (rather than highly repressive) mother.


Patterns of Emotional Adjustment (Rorschach)

16. Basic emotional attitude shows signs of constraint. There is a reluctance to admit the existence of emotional problems. Underlying submissive tendency possibly which is in conflict with what he, as a boy, is expected to do.

17. [...] impulse in terms of constrictive control [...] control functions, both [...] exist side by side, as it were, [...] constant element of conflict [...]

18. While there are evidences of anxiety, chiefly in terms of negativist [...] anxiety which is [...] than some deep seated disturbance that is taken on [...]

19. As the source of anxiety, we might postulate guilt over resistance to paternal control, ambivalent feelings toward [...] and possibly a component of covert hostility, aside from concern about heterosexual relations.

21. (S) tends [...] be conscientious, the proverbial "try and try again" attitude of rigid and righteous perseverance.

22. There [...] of seeking [...] (i.e., propelled) sensuous aesthetic pleasure through [...] sufficient [...] other substitute [...] he probably does not involve inter[...]

23. There [...] an underlying [...]

24. At the present time, [...] himself really to become [...] personal relations, due to preoccupation with emotional problems and as a protection against infringement [...] ego integrity.


In terms of his childlike emotional dependency, which is denied in most instances, I would not think that S is capable of close interpersonal ties at this time.

I would expect him to form fleeting associations to get emotional support; i.e., he'll associate with different pals to do things, rather than choosing people to be with.

25. No information

26. There seems to be considerable anxiety with regard to sexual impulses.

27. In terms of progression into social sex role, he is decidedly in a period of transition, somewhat handicapped by his relative emotional-social immaturity.


EXAMPLE OF "HIGH-CONTRAST" MATCHING

The following set of data is an instance of high contrast between Ss. The TAT summary and criterion statement obviously refer to different Ss. As before, the TAT summary is given to the left, the incorrect and correct criterion statements are listed to the right, with the correct statement immediately below the incorrect one:


Triad [...], Individual A

Family Dynamics

The family picture is a rather detached and impersonal one. The dominant feature should include a mother to whom there may or may not be some hostility, but who is seen somewhat as an impersonal nuisance. The mother is clearly present and accepted only as a controlling person. It is as though the mother had very low emotional contact with him, but did assert herself so that he cannot afford to ignore her or be hostile to her. The father relation seems equally or even more distant, yet at this distance, seen as a desirable object. He has almost a devotion to this distant father image. The total family picture does not stress warmth or unity of interpersonal relations. There is the distinct probability of direct sibling hostility.


Triad II, Test Form 7
(Subjective: Interview & Observation) (incorrect)

Family Dynamics

6. Obs: The house is quite an old house, but attractively furnished [...] a good many books in the living room, magazines, etc. [...] one gets the impression of this family as very secure, affectionate home atmosphere, in which the parents are secure and permit the children to become so. Apparently they take real pride in the children's behavior in [...] Both Mr. and Mrs. [...] are very easy to talk to and talk freely and naturally. We did not confine our conversation to discussion of the children, but talked about national affairs, books, etc.


Triad I, Test Form 8
(Subjective: Interview and Observation) (correct)

Family Dynamics

6. Obs: Family lives in a small one story house which looks well kept up from the outside. Inside, it is extremely disorderly. All of the young children were very dirty, both clothing and faces, and they didn't look as if much attention had been given them. Mother revealed that the kids tear around the house a good deal and have broken the windows many times. The babies were always dirty, with runny noses and wet pants.

Mother: I don't let him keep all that money (earned on a job). I usually have him give me [...] put it away; he gets to keep the 50 [...] He eats pecky, doesn't like vegetables or nothing raw. ("Does he get a piece of fruit every [...]") [...] guess not when [...] we have it. He drinks (milk) a quart at a time [...] He'll drink [...]
EXAMPLE OF "LOW-CONTRAST" MATCHING

The Judges were asked to match items taken from one of the criterion tests with relevant portions of a given TAT summary. In some instances, two or more individuals from the same triad showed much similarity in personality structure. The criterion statements of such individuals are necessarily similar also. In those cases of low contrast, even where the criterion and the TAT do not refer to the same S, the accidental similarity between the two statements might lead the judges to believe that the statements had a common referent. The judges' response would probably have been to score the items incorrectly as referring to the same S.

In the instance of low contrast cited below, the TAT summary is given to the left, the criterion statements to the right: first the incorrect statement, and immediately below it, the correct one:


Triad II, Individual [...]

Peer Group Relationships

S's relationship with peers is extremely poor: poor in the sense that he has few positive ties to them and in general he is received negatively by them. This is an instance where the social stereotypes will probably operate strongly against him to give him a reputation of rejection. Actually he has very little hostility against his peers, but his feelings of self-martyrdom, his strong introspection, and his rather sensuous and full inner vitality will be received with suspicion by the group. I should expect that his sociometric scores would be heavily on the rejection side. While he is a good object for scapegoating, I doubt if he received much of that kind of attack at present.


Triad II, Test Form 3
(Objective-Sociometric Test Analysis) (incorrect)

2. "[...] occasional flare-ups of temper. . . . [...] is regarded as a weak, distasteful person, not to be trusted."


Triad II, Test Form 2
(Objective-Sociometric Test Analysis) (correct)

3. "The group shuts him out but does not seem to punish."

"His classmates respond to him negatively, apparently not for what he does or says but because of some 'queerness' which they find distasteful."

"[...] does not seem to be the object of hostility or scapegoating, but rather of undefined distaste or [...]"

VALIDITY OF CRITERION INSTRUMENTS

The degree to which criterion instruments "measure what they purport to measure" limits the usefulness of the statements produced by these instruments in testing the validity of the TAT predictions.

The projective instrument used (Rorschach) is a well-validated test, in most of the senses in which "validity" is used psychologically. Its dependability in clinical diagnosis, as well as its differentiating power in "normal" Ss, is too well known and the validation studies too numerous to warrant the listing of such findings here.

The subjective instruments (observations and interviews) are directly extrapolated samples of the behavior level they serve. As such, they do not "measure" or "stand for" the behavior they describe; they are that behavior level. A "validation" of the observations and interviews would be meaningless for the purposes of this study.

The same reasoning applies, in the present research, to certain of the objective criterion instruments. The Sociogram and the Guess Who Test are felt to yield objective information in the Peer Area. These tests tap directly the stimulus value and group interaction of each S with his peers. Mental functioning, in the objective data, is represented by well-standardized tests of intelligence and achievement, by school grades, and by teachers' observations (the latter two, like the Sociogram and Guess Who, expressing directly the content with which they deal). However, two "objective" instruments, the Family Relations Questionnaire (yielding data in the Family area) and the Emotional Response Test (producing Emotional Adjustment statements), require comment.

1. The Emotional Response Test.* This is "a series of questions asked by the tester to elicit from the Ss statements concerning activities and persons who produced in them the emotions of happiness, sadness, anger, fear, and shame" (Davis, 1953, p. 14). The information asked of the S is introspective, and requires the S to report feelings rather than exhibit behavior. Such material is subject to more distortion than the content of, e.g., the Sociogram or the Guess Who Test, and we cannot accept it as automatically as a literal statement of the S's feeling reactions.

* This test was developed for a comparative study of moral ideology and emotional attitudes in three societies (5). Details of administration, scoring, categorization, and reliability testing will be found in this study (pp. 14-23, 623-651); the nonqualified "validational" material dealing with the relationship of ERT responses to observed behavior can also be found in the section of that study which treats results (pp. 350-380, and summarized on pp. 602-612).

Measurement of the reliability of the test (defined as consistency of response categories in two matched groups) gave correlation coefficients of .59-.66 (±.03) for "Anger," of .63-.86 (±.01) for "Sadness," and of .93-.95 (±.01) for "Happiness." No formal validation of the ERT has been reported, but its general agreement with the results of other tests used in the same research and its usefulness in elaborating and integrating the psychic hypotheses to which the research was addressed provide sufficient justification for its inclusion in this study. Furthermore, the interpretive procedure and the reliability standardization of the ERT were performed on an age group which is approximately the same as the Ss of the present study, which should provide maximum relevance for its application to this adolescent population.
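
To make the reliability measure concrete, the sketch below correlates response-category frequencies from two matched groups. It is an illustration only: the text does not say which coefficient was used, so a Pearson product-moment correlation is assumed here, and the category counts are invented, not taken from the ERT data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical counts of responses falling in five categories,
# tallied separately for two matched groups of Ss.
group_1 = [12, 7, 3, 9, 5]
group_2 = [10, 8, 4, 9, 3]
r = pearson_r(group_1, group_2)
print(f"consistency between groups: r = {r:.2f}")
```

A coefficient near 1 would mean the two groups distribute their responses over the categories in nearly the same way, which is the sense of "consistency" given in the definition above.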


2. The Family Relations Questionnaire. The validity 


of the FRQ as a diagnostic instrument has beet 
tested. The responses to the Questionnaire were interpreted t 


» yield information about specific traits of 
the S and about more general personality trends. These interpretations were compared with inde- 
pendent estimates of the traits derived from a battery of other tests (the “Guess Who Test.” the 
“Portrait Guess Who,” the “Check List,” and the “Character Sketch” 


r ) and with broader personality descrip 
tions derived from clinical ratings and from more global personality tests (e.g., the California Perso 
ality Test), with an average trait correlation of .67 (range .50-.79). These measures of agreement were 
not ideally high, but are consistent. Since, in the present study, we are using the FR responses without 
interpretation as a simple report of family interaction, these 


correlations obtained from more exacting 
use of the test are ample ground for its inclusion as 


one ot our criterion instruments 


APPENDIX B

TABULAR AND GRAPHIC PRESENTATION 
OF RESULTS OF THE STUDY 
TABLE 4

Analysis of Variance Block Scores

I Scores

Area   Scoresᵃ   N   X̄ᵇ        Subject   Scoresᵃ   N   X̄ᵇ


24154 972 24.849 1 2252 108 20.851 


22 i 14.323 3 1529 108 14.157 


25531 


76517 


* Standard error (area) 


Trio
1     30460   1296   23.503        10   2370   108
2     21551   1296   16.628        11   1861   108
3     24506   1296   18.908        12   3103   108
                                   13   2706   108
      76517                        14   2054   108
ᶜ Standard error (trio)


Triad 


4642 324 14.327 18 1468 108 13.592 * 
0443 324 19.885 19 1220 108 11.296 t 
5469 7 


* Standard error (triad 


Instrument 
26532 1296 20.4 38 10 108 19.509 

4 24078 1296 18.578 844 108 7814 

25907 1296 19.989 

¢ Standard error (subject 3.33 + 

76517 ‘ 

¢ Standard error (instrument . 


Order
1     28507   1296   21.996
2     25102   1296   19.368
3     22908   1296   17.675
      76517

ᶜ Standard error (order)


Sequence 


900 
8 372 16.091 
10.645 


0 Test first in series (for the individual si
1 Test preceded by Projective and Objective matching
2 Test preceded by Projective matching
3 Test preceded by Projective and Subjective matching
4 Test preceded by Objective and Subjective matching
5 Test preceded by Objective and Projective matching
6 Test preceded by Subjective matching
7 Test preceded by Subjective and Objective matching
8 Test preceded by Objective matching
9 Test preceded by Subjective and Projective matching


ᵃ For definition of “score” see text, pages 29, 30.

ᵇ Since chance = 0, the Means column (X̄) represents the percentage of correct judgments in each category
greater than chance.


JS Sr SSe 1 
¢ Formula for standard error V ( vO assumes equal variances in all levels of the factor 
arr a ( 


4 
| 4 1669 108 15.453 ve 
6 2477 108 22.935 
12117 7 1798 108 16.648 
1 
2 
3 
4 7834 324 24.179 21 1609 108 14.898 
5 7984 324 24.641 22 1909 108 17.675 nite 
6 5514 324 17.018 23 1199 108 20.361 
7 3386 324 10.450 24 1901 108 18.601 f : 
6009 324 18.546 25 2490 108 23.085 
9 9008 324 27.862 26 3637 108 33.675 cae e 
10 7585 324 23.410 27 3881 108 26.675 
8172 324 25.222 28 3009 108 27.861 
12 4471 324 13.799 29 1853 108 17.157 
30 2723 108 25.212 2 
76517 31 2384 108 22.074 at ig 
' 
980 
0 26147 1296 20.175 
1 7349 420 17.497 
9696 456 21.263 
3 8112 420 19.314 
4 8957 384 23.325 3 fe 
6 7TRAS 396 19.785 
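The legible rows of Table 4 suggest that each entry in the Means column is the block score divided by its N, and that the level scores within a factor add up to the grand total of 76517. The sketch below checks this for the three Trio rows. It rests on two assumptions that the source does not state outright: that the printed means are truncated to three decimals (hence the small tolerance), and that the first Trio row, whose level label is illegible, belongs to Trio 1. The names `TRIO_ROWS`, `block_mean`, and `check_rows` are illustrative.

```python
# (score, N, printed mean) for the three Trio levels of Table 4 (I Scores).
# The first row's level label is illegible in the source; assumed to be Trio 1.
TRIO_ROWS = [
    (30460, 1296, 23.503),
    (21551, 1296, 16.628),
    (24506, 1296, 18.908),
]
GRAND_TOTAL = 76517  # total printed under each factor in Table 4


def block_mean(score, n):
    """Mean block score: total score divided by the number of judgments."""
    return score / n


def check_rows(rows, grand_total, tol=0.002):
    """Return (scores sum to the grand total, score/N matches each printed mean)."""
    total_ok = sum(score for score, _, _ in rows) == grand_total
    means_ok = all(abs(block_mean(s, n) - m) < tol for s, n, m in rows)
    return total_ok, means_ok


total_ok, means_ok = check_rows(TRIO_ROWS, GRAND_TOTAL)
```

The same check can be applied to any other factor whose scores, Ns, and means are all legible, which gives a simple way to detect digits lost in reproduction.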
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TABLE 5 


Analysis of Variance Block Scores 


D Scores 


X 


Subject 


11.000 


3 928 972 12.271 3 1614 108 14.944 
7 


51918 
> Standard error (area) 


Trio
1     18814   1296   14.516        10   1724   108   15.962
2     16289   1296   12.568        11   1516   108   14.037
3     16815          12.97
      51918

ᵇ Standard error (trio)


Tris 18 1616 108 14.962 


3 3892 324 12.01 

4 4809 324 14.84 

‘ = 180 108 13.935 

3 1870 108 14.537 

» 

9 324 16.611 ; 163 108 

12 3447 3) 0.638 29 1531 108 14.175 > 

2 3447 10.63 1420 108 13,148 


31 1561 108 14.453 
51918 3 1260 108 11.666 
> Standard error (triad) 1.063 33 10 108 19.509 
34 99 108 9.232 
Instrument                         35   1422   108   13.166
1     17195   1296   13.267        36   1028   108    9.518
2     16664   1296   12.858
3     18059   1296   13.934        ᵇ Standard error (subject) 1.83
      51918

ᵇ Standard error (instrument)


18905 1296 14 
16555 1296 12.773 

3 16458 1296 12.699 
51918 


Standard error (order) 


Sequence® 
0 17177 1296 


51918 


® See legend in Sequence table for I scor 


SS: SS 1 
» Formula for Standard error (y = ( - o)) assumes equal variances in all levels of the factor. 
dig 


4 
\rea Scores N X Scores N 
1 15230 972 15.668 1 LIS88 108 
1474 108 13.648 
6 1402 108 12.981 oe 
= .Ol4 1119 108 10.361 
1542 108 14.277 
9 1231 108 11.398 
1 
2 
l 0 108 15.805 
14 1439 108 13.324 
16 63 108 5.898 
533 
Order 
13.253 
me 1 5163 420 12.29? ie 
2 6734 456 14.76 
4 S716 14.885 
5 ROS 22.361 
6 446. 396 11.26 = 
9 741 4s 15.437 
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DISTRIBUTION OF "D" SCORES 
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INTERJUDGE AGREEMENT 


Trio #1 brio #2 


is 


(Judges la Judges Arn 


Judge 4 Judge 8 


Like Unlike Like Unlike Total 


Like 37 Like 


Unlike 3 Judge 2. Unlike 277 1005 


Total Total 1108 1381 


and 7) Trio #3 


Judge 7 Judges 3 and 6 


Like Unlike 
Potal 
Like 


Unlike 


Total Judge 3) Unlike 
ola 


Like 
Judge 2 Unlike 


Total 2500 


TABLE 6
282 
1282 
75 
Judge 1 
? 
Total 1094 1338 2432 
516 
150 
4 Trio #2 
Judges 2 and 5
                      Judge 5
                  Like   Unlike   Total
Judge 2  Like      774     451    1225
         Unlike    433     842    1275

Judges 3 and 9
                      Judge 9
                  Like   Unlike   Total
Judge 3  Like      737     451    1188
         Unlike    234    1016    1250
         Total     971    1467    2438
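The Like/Unlike cross-tabulations of Table 6 can be summarized by the proportion of judgments on which two judges agree. The sketch below does this for Judges 3 and 9. Two of the four cell counts (737 and 451) are not fully legible in the source and are recovered here from the printed row and column totals (1188, 1250, 971, 1467, 2438), so they should be treated as an assumption. The chance-corrected coefficient (Cohen's kappa) is added only for illustration; the source does not report it, and the function name is hypothetical.

```python
def agreement_stats(both_like, like_unlike, unlike_like, both_unlike):
    """Observed agreement and Cohen's kappa for a 2 x 2 Like/Unlike table.

    Arguments are the four cells, rows = first judge, columns = second judge.
    """
    n = both_like + like_unlike + unlike_like + both_unlike
    observed = (both_like + both_unlike) / n
    row_like = both_like + like_unlike   # first judge said "Like"
    col_like = both_like + unlike_like   # second judge said "Like"
    # Agreement expected by chance from the two judges' marginal totals
    expected = (row_like * col_like + (n - row_like) * (n - col_like)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Judges 3 and 9; cells 737 and 451 are recovered from the marginal totals.
observed, kappa = agreement_stats(737, 451, 234, 1016)
```

Observed agreement alone can look high when both judges favor one category, which is why a chance-corrected index is shown alongside it.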
TABLE 7

Occurrence of Items of “No Information” Summarized for Behavior Areas and
Criterion Instruments


Criterion Instruments 
Area of Statement Number of Items and Percentages 
| 

ships | Included Totals 

Pencil-and- Observation 

Rorschach 

and Interview 


Preferential Occurrence of “No Information” to Particular Items

21.4%   N 108   42.8   100% of “No Information” items for this area found in Items 26 and 27

Intellect   50% of “No Information” items for this area found in Item 14, 70% found in Items 14 and 15

Emotional Adjustment   -25   56.3   75% of “No Information” items for this area found in Items 19, 22, and 25


Totals 1431 782 1134 3347 


* Entries in this column indicate percentage of total Instrument items of no information contained in the
particular Area-Instrument block.

Entries in this column indicate percentage of total Area items of no information contained in the
particular Area-Instrument block.


| 
Peer group 1-3, N 90 | 35.7% WN S4 
; 26, 27 - 
6.3* 
N 342 | 42.3 | N 404 | 50.0' N63 | 7.8| 809 | 
Family 4-9 — 
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